var, val and Immutability

Why ‘var’ or ‘val’ at all?

TL;DR: adding ‘var’ (or ‘val’) makes code clearer than the python syntax.

Python programs traditionally just set variables to a value: no fuss, no extra step.  Creating a variable by giving it a value uses the same syntax as setting an existing variable to a new value.  Using the same syntax does come with problems, and understanding those problems can help explain why kotlin has ‘var’ and ‘val’, and how they can help.

Firstly, a typo or misspelling can create a new variable when the intention was to set an existing variable to a new value.  But even without misspellings or typos, it can be useful when reading the code to be able to see immediately if a new variable is being created, or an existing variable reused.  Sometimes, it can get really hard to tell which is happening.

Consider this python code:


#traditional python

def outer(a):
   b = 5
   def inner1():
      return b + 3
   def inner2():
      b = 2
      return b + 3

# python with types
def outer(a:int)-> int:
    b:int = 5
    def inner1()->int:
       return b + 3
    def inner2()-> int:
       b:int = 2
       return b + 3

Function ‘inner1’ uses the variable ‘b’.  This variable comes from the scope of the function ‘outer’.  Function ‘inner2’ also uses a variable ‘b’, but in this case, because there is an assignment to ‘b’ in that inner scope, this is a different ‘b’ from the outer scope ‘b’.  This can be quite confusing, as inner2 is similar to inner1 and it is logical to expect that the outer ‘b’ is again being used.  The ‘python with types’ example does make it clear to the reader that there is a new declaration, but only if the programmer already realises what is happening and writes the assignment to ‘b’ with the ‘:int’.  If the programmer actually wanted both inner1 and inner2 to use the variable ‘b’ from the outer scope, then a new variable is being created without the programmer knowing.  I suspect almost every python programmer who has used nested functions has been tripped up at least once by this type of problem.
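The difference between the two inner functions can be checked by running them.  The sketch below is only an illustration: the function ‘inner3’ and the python ‘nonlocal’ statement are not part of the example above, and are added here as an assumption about what the programmer might have wanted, namely to reuse the ‘b’ from ‘outer’.

```python
def outer():
    b = 5

    def inner1():
        return b + 3      # reads the 'b' from outer

    def inner2():
        b = 2             # assignment: python quietly creates a new local 'b'
        return b + 3

    def inner3():         # hypothetical extra function, not in the example above
        nonlocal b        # explicitly ask for the 'b' from outer
        b = 2
        return b + 3

    results = [inner1(), inner2(), b]   # outer 'b' is still 5 after inner2
    results += [inner3(), b]            # outer 'b' is 2 after inner3
    return results


print(outer())  # [8, 5, 5, 5, 2]
```

Only the ‘nonlocal’ line tells the reader which ‘b’ is meant, and nothing forces the programmer to write it.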

This potential confusion is all a consequence of the fact that in python there is no difference between declaring a variable, and reusing a variable.  Kotlin does make the difference visible.  Consider two kotlin versions of ‘inner2’:

fun inner2():Int{  // first version
   var b = 2
   return b + 3
}
fun inner2():Int{ // second version
   b = 2
   return b + 3
}

I left out the ‘outer’ level to draw attention to the ‘var b = 2’ vs ‘b = 2’.  The first version has a new local ‘b’ (like the python with an assignment); the second version uses the ‘b’ from ‘outer’, even though there is an assignment.  The value of the ‘var’ is that there is a clear indication of ‘new variable’ vs ‘existing variable’.

Using types with python can help, but takes more code than with kotlin.

Python has recently allowed the optional specification of type.  Kotlin requires every variable to have a type, but with less code than specifying type with python.  In fact in many examples, kotlin manages to be as concise as python while still specifying type.  Kotlin achieves this by inferring type, and it is difficult to see how that could ever be provided in python.  The strength of python is you do not need to specify type at all.  If the program is just for yourself, then exceptions if things go wrong are better than having to write extra code.  Kotlin is a language for when you write programs to be run by other people, python is perfect for when the programs are for yourself.

summary

Declaring a variable can be the same as python, but with the keyword ‘var’ making it clear that this is a declaration, not the reuse of an existing variable.  In python, the ‘:’ and type are optional, and an aid to both readability and tools like ‘mypy’, but do not change what the code does.  In kotlin, the ‘:’ and type are needed when the language cannot learn the type from the value being assigned.  ‘var a = 3’ is setting ‘a’ to an integer, so ‘a’ must be an integer.  With ‘var a:Int = 3’, like python, the type is not needed; with ‘var a:Int? = 3’ the type is needed, because this is saying ‘a’ may be either an Int (as it is initially) or ‘null’.

var vs val: A new concept?

So declaring a variable can be the same as python, but with the keyword ‘var’ making it clear that this is a declaration, not the reuse of an existing variable.  This explains ‘var’.  ‘val’ declares a new variable just like ‘var’, however in this case the variable cannot be reused: it cannot appear on the left of a future ‘=’ statement.

This is somewhat similar to simply using capitalised names in python to indicate a constant.  However, a ‘val’ just means that the ‘variable’ (or value) will always reference the same object; it does not ensure that object will not change.  See immutability below.

Immutability

Consider in python if ‘a’ is set to a list: the list can change, and thus the value of ‘a’ changes, without setting ‘a’ itself to a new value.

>>> a = []
>>> b = a  # set b to the same list
>>> a.append(5)  # change the list, but a is still referencing the same list
>>> print(a)  # a is not the same as it was
[5]
>>> print(b)  # b is still the same list as a, so has the same change
[5]

‘a’ and ‘b’ could both be ‘val’ in kotlin as they are not assigned new values, even though the actual value they reference does change.  The change with kotlin is that there is ‘MutableList’ (the equivalent of the python ‘list’) as well as the regular ‘List’, which is read-only and so cannot be changed through that variable.

In python, all the basic types: int str bool float etc., are ‘immutable’.  I cannot reproduce the ‘side effect’ above that can happen with a ‘list’, with any of those basic types.  If I set ‘a’ to a string, then that string cannot change.  I can only set ‘a’ to a new string, which will leave the original string unchanged.  If ‘b’ is set to ‘a’ with a string, ‘b’ will not change, because while I can change which string it is that ‘a’ references, the strings themselves are immutable.
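The same experiment as the list example can be repeated with a string.  This is a small sketch, with the names ‘a’ and ‘b’ reused from the text and the values chosen only for illustration.

```python
a = "hello"
b = a                  # b references the same string object as a
a = a + " world"       # builds a new string; the original string is untouched

print(a)               # hello world
print(b)               # hello

try:
    b[0] = "J"         # strings do not support item assignment
except TypeError as error:
    print("cannot change a string in place:", error)
```

Setting ‘a’ to a new string leaves ‘b’ alone, and the attempt to change the string itself is refused.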

In python all basic types are immutable, but collections: lists, sets, dictionaries and their derived classes, are mutable. In kotlin each of these collection classes is immutable, but has a Mutable alternative class.

Null Safety

 

To java programmers, ‘Null Safety’ can sound like the ‘Holy Grail’.  To python programmers, ‘Null Safety’ can sound like solving a problem they never had.  Truth is, the ‘null’ (or None) problem is still there for python programs, but it shows up as many varied problems, not as a single one.  Python has ‘None’, which is in many ways more like the kotlin ‘Unit’, with no actual ‘null’.

Python with types.

Consider the python code in the ‘Variables and Types’ section, where the first line reads “a: int”.  This code is, at best, a little strange in any language as it says “the variable called ‘a’ will be an int, but currently has no value”.  How can an ‘int’ have no value?  The reality is in python we could set our variable to ‘None’ to indicate it has no value, but then ‘None’ is not an int (‘type(None)’ is ‘NoneType’).  So to set this new variable to ‘None’ in python, it needs to be “int or None”, and to write that in python (which is very new for python) we would need “a: Union[int, None]”.  Far easier to simply give a value to our new variable, as the code does with ‘b’.  Line 2, “b:int = 6”, declares ‘b’ and sets ‘b’ to 6.  Simple in both languages.  So now we have learnt that the lines:

a = None # python, the usual way!
a: Union[int, None] #python with types
var a: Int? // kotlin

Are all the same.  The first line is typical python.  The second line is python using types (technically it could be “a: Optional[int]”, but the union is perhaps more self explanatory), and the third line is kotlin.  In kotlin, “Int?” says what python says with “either int or None”.  The kotlin code is nice, but almost all current python code would simply set the variable ‘a’ to ‘None’, and not bother declaring the types.  In python, normally the programmer just has to remember “I am using ‘a’ as int, but it could be a None”.  The programmer themselves has to remember ‘a’ could be None.

So far, kotlin has made a nicer way to say something most python code does not bother to say.  Yes, it is a good idea to document in some way that this ‘a’ should be an ‘int’, but beware it can also be None, and the kotlin is probably a better way to document what is happening, but how does this help?

Null safety means that not only have we documented that our variable can be “None” or “null”, but that the language can help us avoid getting errors.  Kotlin does this by making it super easy to check for ‘null’, and by reminding us to check if we have forgotten.  This can avoid many a program error.

Consider a diagram similar to the background diagram of variables.  In kotlin, ‘a’ is pointing nowhere, while in python ‘a’ would be pointing to the ‘None’ object.  But in either case, if we try and use ‘a’ as an integer, we will get an exception.

We could, for example, in a specific situation wish to print ‘a’, but in that situation wish to use 10 for the cases where ‘a’ is ‘None’.  The code would be:

print(a if a is not None else 10) # puts what we are printing first ...but!
# or ....
print(10 if a is None else a)	# python and shorter

println(a ?: 10)  // kotlin

The kotlin is very readable, and we get a warning if we forget to deal with ‘null’ (or None). Even the shorter  python version is longer than the kotlin code, and neither python version is ideal. The shorter version hides that we are normally printing (‘a’) and the longer version has a messy ‘not None else’ but still tells the story better than the shorter code. But the main point is that if we forget that ‘a’ could be ‘None’, then this is only discovered in testing, and then only if we remember to test that case. Kotlin reminds us automatically that ‘a’ could be null, and ensures the code encounters no problem.

One more example:

a = None # for python 3.6 could use "a: Union[str, None] = None"
# .... (some code that might set a, but might leave it not set)
print(0 if a is None else len(a))  # python

var a: String? = null
// ......(some code that might set a to a new string)
println(a?.length ?: 0)  // kotlin

Again kotlin will ensure we deal with the possibility that ‘a’ is null and will prevent code that could break at run time, plus it makes that code very easy to write.
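To see what ‘break at run time’ means on the python side, here is a minimal sketch.  The function names ‘length_or_zero’ and ‘unsafe_length’ are made up for this illustration.  A checker such as ‘mypy’ can be expected to complain about the second function, but python itself runs it without any warning until a ‘None’ actually arrives.

```python
from typing import Optional


def length_or_zero(a: Optional[str]) -> int:
    # the None case is handled, as in the python line above
    return 0 if a is None else len(a)


def unsafe_length(a: Optional[str]) -> int:
    # the None case was forgotten; python accepts this code as written
    return len(a)


print(length_or_zero(None))   # 0
print(length_or_zero("abc"))  # 3

try:
    unsafe_length(None)
except TypeError as error:
    print("only found at run time:", error)
```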

The end result is that it becomes far easier, and safer, to code solutions where ‘null’ represents ‘has no real value at this time’.  An important technique becomes easier to use, and safe when it is used.

Basics: From Python to Kotlin

Python is an interpreted language, often described as a ‘scripting language’, while Kotlin is a compiled language, targeting robust development of complex applications.  While both have features that allow them to expand beyond their origins, it is understanding these origins and the thinking behind them that gives the best understanding of their basic differences.  The two languages start from almost opposite sides of the ‘static compiler vs dynamic interpreter’ worlds, but each has adopted things from the other side of the debate so that both are now in the middle.  But the starting point still shows.

The topics covered here are:

  • language origin
  • installing
  • Run kotlin with Java
  • kotlin reads your code
  • comments and quotes
  • variables and types

Language Origin.

The origins of python date back to 1989, but the early versions circulated between very few people.  Python 2 was released in 2000, and as you may gather from the fact that no one ever speaks today of ‘Python 1’, before python 2 distribution was quite limited.  This allowed Python to evolve prior to being locked into a large user base and is generally seen as a good thing.  Python started very much as an interpreted ‘scripting language’, but clearly has become a general purpose programming language.  Python was invented by Guido van Rossum, who is still the BDFL (benevolent dictator for life), but is developed by a strong open source community.  Guido named ‘Python’ after the comedy troupe ‘Monty Python’, which suggests the language has a fun aspect.

The origins of kotlin are that it started life with a ‘we need something better than java’ brief back in 20 and is influenced by the need for the language to produce code that runs on the JVM (java virtual machine).  Version 1.0 was released in February 2016.  Jetbrains, the team behind kotlin, had a huge codebase that could really benefit from a better language.  After trying Scala, they then ‘cherry picked’ the best language features they could find.  Java was named after a large tropical island half way around the world from the Java developers, famous for growing coffee.  To follow that, the kotlin team decided to also choose an island, and they chose a cold island in the Baltic sea, famous for a class of warship being named after it, and the closest island to their offices.  They may lack humour and imagination, or perhaps they love irony.  Which, only time will tell.  Perhaps it is a good time to review the page ‘why kotlin?’.

Installing

To run python programs, the first step is to install python from python.org.  This also gives you a very basic development tool in the form of ‘idle’.  You can run programs directly from that tool, but once the program is developed, it can be run on any computer where python is installed.  Just copy the source file to the other computer.  In reality, if continuing to develop with python, a better IDE is needed.

For kotlin, it is best to first download a version of the JDK.  Then download IntelliJ from Jetbrains.

See here for getting a first app running.

Run kotlin with Java.

There are tutorials here and other places to get your first kotlin app running.  With the first app, you will normally run it inside IntelliJ, but how do you run that program without the IDE?  With python, sharing the python .py file(s) to another computer is all that is needed, but there is no easy choice other than to share the source code.  To run the kotlin program from the command line, you need to first use kotlin to produce a ‘class’ or ‘jar’ file, then run that file using java.  I will add more instructions here later…if you need such instructions, please leave a comment.

Kotlin only reads your code.

Python, as follows from its scripting origin, runs each line of your program as it reads that line.  Your program runs as if it was typed into the repl in ‘idle’.  From top to bottom of the file, each line is executed immediately as it is read.  If there is a ‘def’, the function is defined by that def as python reads your program.  Statements outside a ‘def’ happen immediately.  This is the same even if you import a second file: any code in that second file outside of ‘def’ statements will be run immediately.

If a file has the line “a = 5” in the file, even if you just import the file, ‘a’ will be set to ‘5’ immediately when python reaches that line.
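The order in which python does things can be made visible with a short sketch.  The list ‘events’ and the function ‘greet’ are hypothetical names used only for this illustration.

```python
events = []

# a statement outside any 'def' runs as soon as python reaches it
events.append("top level statement runs immediately")


def greet():
    # this body runs only when greet() is called
    events.append("body of greet runs only when called")


# the 'def' above has been executed, so 'greet' exists, but its body has not run yet
events.append("after the def: greet is defined, its body has not run")

greet()

for event in events:
    print(event)
```

The ‘after the def’ message is recorded before the message from inside ‘greet’, even though ‘greet’ appears earlier in the file.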

Kotlin does nothing.  You could consider that “all the code is inside ‘fun’ statements, so it is like python with ‘def’ statements”, but it goes further than that.  With kotlin there is a complete ‘compile all code’ phase, and then a separate ‘run’ at the end.

A different hello world.

The first difference from the python ‘hello world’ is that the code for kotlin must be inside a ‘fun’ called ‘main’.  Think of this as the equivalent of:

if __name__ == "__main__":
    print("hello world")
#// equivalent to
fun main(args: Array<String>) {
    println("hello world")
}

In python, the ‘if’ is needed to stop the code running if not main, in kotlin the ‘fun main’ is needed to say the code is ‘main’.

Comments and Quotes

Quite obviously, the comment in kotlin is /* to start and */ to end, with ‘//’ the equivalent of ‘#’.

The equivalent to the docstring is a comment, not a string, and the comment starts with /**  …for more see here

In python you can use ‘single’ quote strings, or “double” quote strings, and “””triple quote””” for longer strings, or r”raw strings” and (from python 3.6) f”format strings”.

In kotlin, single quotes are not strings; they are single character literals, which have no python equivalent.  All “regular” double quote strings are format strings (string templates).  “””triple quotes””” are the equivalent of both the python “”” and the python r”raw” string, but also still allow templates.  See here for more.
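For reference, the python string forms listed above look like this in running code.  The variable names and values are chosen only for this sketch.

```python
name = "world"

single = 'hello'              # single and double quotes make the same kind of string
double = "hello"
triple = """line one
line two"""                   # triple quotes can span lines
raw = r"C:\new\table"         # raw string: the backslashes are kept as typed
formatted = f"hello {name}"   # format string (python 3.6 and later)

print(single == double)       # True
print(triple)
print(raw)
print(formatted)              # hello world
```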

Variables and Types

As explained in the background, python has no static variables, with actual values held in ‘dynamic’ memory, and variables just pointing to the values. Since the only static information for every variable is a pointer, all variables themselves are exactly the same in Python, and use the same amount of memory, and it is only the data they are pointed to that can be different. This means no type is needed, as every variable will be the same.   As of python 3.6, you could write:

a: int
b: int = 6

def myfun(arg1: int, arg2: str) -> int:

But all the types in there actually do nothing in python itself and ‘mypy’ is needed to give you warnings.  The first statement “a: int” actually does nothing at all, and if the second was “b: int = ‘abc'” it would still work.  Almost no python at this time is written with all these types, but they have been added because it is felt that having them (or at least the option) could make python better.
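Both claims can be tried out.  This is a sketch with made up function names, and it wraps the two statements in functions only so that the results can be returned and printed.

```python
def annotation_is_not_enforced():
    b: int = "abc"        # accepted at run time; only a checker such as mypy objects
    return type(b).__name__


def bare_annotation_sets_nothing():
    a: int                # records a type, but gives 'a' no value
    try:
        return a
    except NameError:     # UnboundLocalError is a subclass of NameError
        return "a has no value"


print(annotation_is_not_enforced())     # str
print(bare_annotation_sets_nothing())   # a has no value
```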

In kotlin, every variable has a type. The code becomes:

var a: Int?
var b: Int = 6

fun myfun(arg1: Int, arg2: String) : Int{

This is almost the same as the python, except that as we said, almost no python is actually written this way.  Note that in kotlin, all classes, even the built in classes, have the consistent leading capital.  Python differs because ‘int’ and ‘str’ and other primitive types were not actually classes when the language started, so they kept their lower case names.

Variable names are normally camelCaseNames instead of names_with_underscores.  This does make the names slightly shorter but is a matter of preference.

Because in kotlin there are static types available, which better allows for high performance code or even producing native code, it is essential that the language knows the type.  It is clear that even some python programmers would like code to know the type also,  and many regard the type being clear a very good thing. There are many situations dealing with types that are new for python programmers and this will be covered in future pages.

Variables and Objects

Introduction

This page follows on from the languages page, which reviewed the basic principles of how programming languages work, by further exploring how ‘variables’ in programming languages work.  Understanding this helps in understanding the philosophies of python and kotlin.

In this page, the language ‘c’ is discussed, but there is no need to know how to program in ‘c’ any more than there was a need to program in machine code or assembler for the last chapter.  It is just that explaining how ‘c’ works allows explaining some simple examples.

Topics:

  • Variables and ‘static memory’.
  • Variables and dynamic memory
  • Objects and memory.
  • More about objects.

Variables and Static memory.

In a language such as the original ‘C’ language, a byte variable requires a single byte of memory to be reserved for that variable.  The original ‘int’ variable required 2 bytes, and a string required one byte for each character plus one extra byte to mark the end of the string.  If a string was to hold a variable number of characters, then the variable could be declared with the maximum allowed size (as in ‘char name[80];’ to declare a variable called name able to hold 80 characters), and then either a separate variable declared to record the current length of ‘name’, or a special character at the end to indicate the end of the characters in use.

A new concept added in ‘c’ was the ‘pointer variable’.  This variable would not hold program information such as a ‘char’ or an ‘int’ or a string, but would hold the number of the memory location (the ‘address’) where information is stored.  Consider two ‘int’ variables ‘a’ and ‘b’ and a pointer variable ‘p’.  The variable ‘p’ could hold the address of ‘a’ or ‘b’.  Operations could operate on ‘p’, or they could operate on the memory location pointed to by ‘p’.  This allows the same code, using ‘p’, to read or change either ‘a’ or ‘b’ depending on where ‘p’ is ‘pointing’ at the time.

All of these variables, including the pointer type, require a fixed amount of memory, so the compiler can calculate the size of the block of memory needed to hold all the variables.  These variables could be called ‘static’ variables as they use the same place in memory throughout the program.  Every time a variable is declared, the space needed to hold that variable is reserved by the language.

A program with only static variables requires only a very low powered computer chip, and the program in your toaster is very likely to work with only static variables.

Variables and dynamic memory: malloc().

Now consider the more complex problem of a linked list, with a variable number of elements.  We could simply declare a variable with the maximum possible size for the most extreme circumstance for the program.  This ‘maximum ever’ size would likely almost never eventuate, meaning that normally there is a lot of wasted memory.

In ‘c’ the answer can come from the ‘malloc’ (memory allocate) function, which manages a pool of memory space and lets the program ask for a block of memory of a given size at run time.  The amount of memory can then vary according to requirements.  When the memory is allocated, the location of the memory can be stored in a ‘pointer variable’ and the program can do whatever it wishes with the pointer.  This for the first time allows true variable length variables.  If we are finished with our list we can call ‘free’ to allow the space to be reused for any other data.  We could later call malloc again asking for a different size, and this time our data may be in a different place.  This makes our data ‘dynamic’, as where in memory our data is located, and how much space the data uses, can change.

From the diagram, it can be seen that our static ‘int’ variable ‘a’ is held in static memory, but the dynamic ‘int’ variable ‘b’ is actually held in dynamic memory, so getting the value is a two step process.  First get the pointer, then use the pointer to get the value.  Dynamic variables make the program run slower, but they give greater flexibility.

Objects and Memory

Now consider objects. Objects are instanced at run time, so the easiest way to handle them is by simply holding a ‘pointer’ in static memory and creating the object in dynamic memory and then pointing to the object.  A simple test for this in python is to try this code:

>>> c = [ 1, 2, 3 ]
>>> d = c
>>> c[1] = 4
>>> print(d)
[1, 4, 3]

It becomes clear that, like that wonderful hand drawn diagram, both ‘c’ and ‘d’ share the same memory, so changing ‘c’ also changes ‘d’.

Every variable in python is in fact a ‘pointer’ or ‘reference’ to an object in dynamic memory.  No exceptions.  You may wonder “but if I set ‘a = 3’ and ‘b = a’, then add ‘2’ to ‘a’ with ‘a = a + 2’, then ‘b’ does not change!”.  This is because ‘a = a + 2’ creates a new dynamic memory value ‘5’ and then ‘a’ is set to point to this new value, while ‘b’ still points to ‘3’.
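The python ‘is’ operator reports whether two variables point at the same object, so the steps just described can be watched as they happen.  The names ‘shared_before’ and ‘shared_after’ are made up for this sketch.

```python
a = 3
b = a
shared_before = a is b    # True: both names point at the same object

a = a + 2                 # a new object 5 is created and 'a' is pointed at it
shared_after = a is b     # False: 'b' still points at the 3

print(a, b, shared_before, shared_after)   # 5 3 True False
```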

More on this in ‘basics’.

Languages: Compilers and Interpreters

TL;DR: check the headings, only read headings of interest

What is Machine code.

The actual CPU (central processing unit), the “intel i7” or “arm-9” chip, has a ‘native code’ instruction set, the ‘machine code’.  These instructions have no concept of variables, or ‘for loops’, or even characters. Everything is reduced to a number (an ‘A’ is normally reduced to the number 65).

Instructions are like “move from this numbered memory location to this register”, “shift this register left 3 bits”, and “mov this value to the instruction pointer” (this last one is a ‘jump’).  On a 16 bit computer, that ‘mov’ instruction for ‘register 7’ would have a specific number of ‘bits’ of the 16 bit instruction indicating the instruction, another group of ‘bits’ within the instruction with the number of the register, ‘0111’ (for register 7), and the next 16 bits of memory supplying the number of the memory location.  In machine code, the ‘mov value from the memory location following this instruction to register 7’ becomes simply a number.

Even those who best understand machine code do not write programs in machine code, except as a learning exercise.  In fact as a learning exercise I highly recommend it, and as long as you don’t try and write a program that will actually do very much it can be fun.

 

Assemblers, Compilers and Interpreters

Writing a program that actually does something significant in ‘machine code’ is simply an inefficient use of time.

Assembler: names for machine instructions and memory locations

In fact even when speaking of ‘machine code’ what is normally assumed is an assembler.  An ‘assembler’ is a program that reads a ‘source file’ and produces actual machine code, in a format that can be read by a ‘loader’, which is a program to load the machine code into the computer memory.

Assembler code gives the instructions names in place of numbers, and combines the ‘register number 7’ part into the ‘move from memory into register’ instruction to make a single number, as in machine code the register number is just part of the overall number or ‘instruction code’.  Memory locations are then given names.  The result is a slightly readable program, and using the assembler is what we actually mean when we write machine language, as no one works with the actual zeros and ones to write their program.

Compilers: the jump to today’s languages

The next level of moving away from the machine instructions and into programs with some meaning to humans is  the ‘compiler’.

Where the assembler allows names for machine instructions and their options and translates them into the actual binary instruction code, a compiler translates ‘abstract commands’ (designed to be more readable for humans than machine instructions) into the sequence of machine instructions needed to produce the same result.  Unlike with assembler, the programmer is no longer aware of, or in control of, the actual machine code instructions.

While the assembler simply gives names to memory locations, a compiler allows reserving memory for a specific purpose and applying rules to how that area of memory will be used by the program.

int a = 3

would be a line of code reserving the correct amount of space in memory for an ‘int’ and declaring the name ‘a’ for referencing that memory.  It would also generate the machine code instructions to load a value of ‘3’ and store this value in that allocated memory.

A compiler takes a program from the language the compiler works with, and translates that program into the machine code instructions to perform the actions described by the program.  After compilation, the machine code program produced by that translation (or compilation) will then need to be run on the computer to test the result.

Consider a speech in a foreign language.  Compiling a program is equivalent to getting a transcript of the speech, and sending it off for translation.  When the translated speech comes back you can read it.

Understanding Program interpreters.

Consider that speech just referred to at the end of ‘compilers’.  If you have an interpreter available, as each sentence is spoken you can have it translated immediately.  At the end of the speech, you have already heard it all.  If you wish to clarify anything 3

An interpreter is a program that reads each line of code, and in place of creating the machine code equivalent, the interpreter itself does what the program says.  It allocates an area of its own memory for the variable, sets its value, or does whatever else is in the program.  This makes an interpreter more interactive than a compiler, but you never get a machine code translation, and cannot ever run the program by itself.  The program does not run on the computer, it runs on the interpreter.  But any computer with an interpreter can run the program.
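As a rough sketch of the idea, here is a toy interpreter written in python.  Everything in it is an assumption made for illustration: the tiny language it reads has only ‘name = number’ and ‘print name’ lines, and the function name ‘interpret’ is made up.  It is nothing like how a real interpreter is built, but it does show the interpreter keeping the program’s variables in its own memory and doing what each line says as it reads that line.

```python
from typing import Dict, List


def interpret(source: str) -> List[str]:
    variables: Dict[str, int] = {}   # the interpreter's own memory for the program's variables
    output: List[str] = []           # what the program 'prints'
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue                 # skip blank lines
        if line.startswith("print "):
            name = line[len("print "):].strip()
            output.append(str(variables[name]))
        else:
            # anything else is treated as 'name = number'
            name, _, value = line.partition("=")
            variables[name.strip()] = int(value)
    return output


program = """
a = 3
b = 4
print a
a = 7
print a
print b
"""

print(interpret(program))   # ['3', '7', '4']
```

No machine code for the toy program is ever produced; it only ever runs inside ‘interpret’.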

The real world: blending compilers and interpreters.

No compiler actually produces machine code instructions for everything a program must do.  Instead, for some operations the compiler outputs machine code that simply calls a premade function to do what the program wants.  Consider ‘print’ as an example.  A compiler will usually generate machine code to get the message to be printed ready, then add code to call the standard print routine.  The resulting program needs to either include that standard print routine and all other ‘standard’ routines, or will need a library of those routines to be available on the computer for the program to run.  These standard routines are like pieces of the interpreter: routines to do the equivalent of what the program says.

A new way in which compiler languages now can seem like interpreter languages is through the ‘REPL’ (Read Evaluate Print Loop) where individual lines of a program can be entered and run one at a time.

Interpreters also adopt some ideas from compilers.  When the complete program is available, starting the interpretation from the beginning each time makes little sense.  Often interpreters can translate the file to a ‘pre-processed’ format, which is not machine code but just pre-checked code or ‘intermediate code’ which is error checked and easier and faster to interpret.
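Python is one interpreter that works this way, and its built in ‘compile’ function and ‘dis’ module let you look at the intermediate code.  The sketch below is only an illustration; the two line program and the name ‘<example>’ are made up, and the exact instructions listed by ‘dis’ vary between python versions.

```python
import dis

source = "a = 5\nb = a + 2\n"

# translate the source into python's intermediate code without running it
code = compile(source, "<example>", "exec")

# run the already translated code, using a dictionary to hold its variables
namespace = {}
exec(code, namespace)
print(namespace["a"], namespace["b"])   # 5 7

# list the intermediate instructions
dis.dis(code)
```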