Scheduling: Schedule, design, code?

Or perhaps: schedule, design, build?  Either way, while that sequence may sound like waterfall, even agile is really a repetition of this sequence over and over.  This page discusses why these 3 steps are problematic for software, but still must be followed despite problems, and how to schedule projects using this sequence that initially can appear to be broken.


Build Vs Engineer: Which Best Describes Software Development?

To be able to schedule the completion of a task, it is useful to consider the nature of the task and the nature of the steps within a task.

Design: Engineering Design vs Artistic Design.

The verb design can be interpreted in subtly different ways. The design of the Rubik’s Cube by Erno Rubik is an example of engineering design, not of artistic design. But someone can make a new Rubik’s Cube with a new artistic design. When an engineering design is considered novel, the designer can apply for a patent, while with artistic design the designer considers their work protected by copyright. The nature of the intellectual property created by the different design types is sufficiently different that there are different ways to protect that intellectual property.

The engineering process as described here gives some insight. Note the iteration, with no way of being certain how many iterations are required to produce a result. The implication is that it is impossible to be certain how long the process will take. However, in most cases, with artistic design, a single iteration is sufficient, making the process easier to predict. Depending on the task, software design can be about artistic design, engineering design or a mixture of both. In reality, the design tasks are almost always engineering design.

Generally software is written because software to do the task does not already exist. Unlike a chair, which you might build because you need another chair, software is generally not written because you need an exact copy of software that already exists. This means there should always be engineering design required, which in turn suggests that the time the process will take could be highly variable and hard to predict.

Engineering Design: A tautology?

Engineering is generally described by the dictionary as both design and building. The dictionary definition suggests a waterfall sequence, while the flow from an engineering perspective clearly conveys a more agile approach. Regardless of the sequence, there is both design and build; however, for engineering the substance is in the design. Consider the phrases ‘engineer a bridge’ and ‘build a bridge’ as we use them today. To engineer a bridge is to design either the bridge or some detail of the bridge, while to build a bridge implies no design work is required. As we use ‘engineer’ in this sense today, an engineering project is considered complete when the design is complete. Building that is necessary to test the design, or to be ready to test the design, could be part of engineering. Once a design is complete, in today’s world we normally use ‘construction’ or simply ‘building’, and keep ‘engineering’ for when there is a new design.

Construction vs Engineering.

While both ‘construction projects’ and ‘engineering projects’ can involve building something and will need a design, the construction project is primarily about the building, and the engineering project is all about the design. A construction company will often outsource the design to an engineering company when custom design is required, and then commence construction with all design work complete.

Consider building Swedish designed flat pack furniture. All that is required of the consumer is building or construction. Engineering is what took place at head office to produce and test the original design. But even within that original engineering project there are simple construction components, because the design of each such component already exists and is already tested.

Construction is where the overall design already exists and is already considered tested. Engineering is where the overall design is not yet tested, although there will normally be components of the overall design that are tested.

Software Development: Construction or Engineering?

The waterfall development process works on the principle that the software development can work like construction, and all design is already complete and effectively tested.

In contrast, agile is based on considering software as an engineering project, and that even components within that engineering project are themselves engineering projects.

Waterfall Advantage: Scheduling

As in a construction industry project, there is still the initial engineering/design phase, which can be very difficult to schedule. In the construction industry the main costs and the major time for the project all take place once the design is agreed, so the variable timing of the shorter, lower cost engineering/design phase is not that significant to the overall project.

Construction itself is following known steps, so it can (at least in theory) be accurately scheduled and costed.  In theory, applying this model to software could provide not only accurate overall scheduling, but with a set of discrete steps, accurate tracking of project progress.

Waterfall Advantages: Skill Diversity and Low Cost

Again consider the construction industry. The design phase requires architects and engineers who are highly educated and costly; however, most of the actual work can be done by labourers, who are less expensive. The construction phase still needs project managers and foremen, but few in proportion to the labourers.

Applying this same model to software results in software architects and business analysts for the design phase, and while the actual development will require project managers and team leaders, these will be few in relation to the more junior programmers, and these programmers can possibly be outsourced or even offshored to lower costs.

The advantages and disadvantages of Agile

Agile has one simple advantage: the engineering metaphor it implements actually does apply to software. This gives the advantage of actually confronting reality. While the waterfall approach has attractions, as a model for software it is fundamentally flawed, since design is required during the ‘construction’ phase. The result is that any supposed advantages are not realistic.

The disadvantage of facing reality is that 1) it becomes clear that it is impossible to guarantee a schedule for a design that is not yet complete, and the design will not be complete until the software is complete, and 2) the skill diversity approach is broken, as with design needed at all stages the low cost ‘labourers’ will still need to make design decisions and the impact of these decisions is significant. So there is no exact scheduling of construction, and the separation of roles into analysts, programmers and project managers is flawed and at the very least needs each skill in every team, if not in every team member.

Summary: There are question marks around whether software can really be reduced to the waterfall model, and that would be the only way to enable reliable time estimates. The implication is that in-project discoveries must impact schedules and cannot be avoided. Something has to give.

Rethink: Scheduling engineering vs scheduling construction.

The Problem.

The mindset of construction is that at the project outset it becomes known what is to be done, and then good planning will result in an accurate timeline for the project.

However, software cannot generally be properly reduced to the construction model, leaving an engineering approach where it is never known exactly what is to be done until the goal is reached, at which point the project is basically complete.

The Proven Solution.

Consider car manufacturers.  They construct cars. In advance of manufacture they know very accurately what the specification of each car to be built will be, and how long to build each car.

However, now consider the engineering aspect of a car manufacturer: designing new cars to then be manufactured. Design, prototyping and tooling. As an engineering task, on the basis of the theory described here, the time to design a new car to a given set of specifications can be estimated (an educated guess), but there is insufficient information for an accurate figure. The solution: something has to give. Either the exact specification or the exact amount of time must be adjustable. Given that many manufacturers desire a new model each year, the time is not adjustable, so the result is that the specification becomes flexible.

A fixed amount of resources is allocated to engineer improvements to the current model. All improvements ready in time become part of the new model. There is a list of desired improvements and ideas for improvements, and these can be prioritised. The limitation is that the exact set of features that will be ready for the next model is not known at the start of the fixed length project. Only those that can be completed in time will be in the next model. Usually there will be more features and improvements identified as desirable than can be ready in time for that next model. A list of the features and improvements thought likely to be ready for the next model is made, and work starts on the list. If a feature or improvement is not ready in time, it will have to wait, as the release date for the next model cannot be pushed back.

Applying the solution to software.

Agile allows taking the industrial engineering approach and applying it to software. Projects like Ubuntu Linux, and now even Windows 10, have new software releases at fixed intervals. The product versions are even based on dates, and those dates are declared at the project outset. There have been two Ubuntu versions every year since October 2004 (4.10 Warty Warthog): one version in April, with the version number of the year and then ‘.04’ (April is the 4th month, so 5.04 was in April of 2005), and then another in October with ‘.10’ for the tenth month. How do they keep such reliable schedules? Features that do not make the deadline are pushed back to the next version, that’s how.
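The date-based numbering described above is simple enough to sketch in a few lines of Python. This is only an illustration of the convention as described on this page; the function name date_based_version is made up for the example and is not part of any Ubuntu tooling.

```python
from datetime import date


def date_based_version(release: date) -> str:
    """Build a version string from the release date: year, then two-digit month."""
    return f"{release.year % 100}.{release.month:02d}"


print(date_based_version(date(2004, 10, 1)))  # 4.10
print(date_based_version(date(2005, 4, 1)))   # 5.04
```

Because the version is a function of the date alone, it can be announced before anyone knows which features will make the release.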

Scheduling and Scrum.

The scrum process is like a series of mini-releases, with the completion of each sprint resulting in a new set of stories, tasks and bug fixes that are working, tested and integrated into the system. Scrum planning can take the same type of approach, with issues that cannot be completed in time pushed back to a future sprint.

The danger is that, if not managed correctly, an individual issue could absorb the team for the entire time allocated to the sprint. If that issue still cannot be completed, then there is a sprint with nothing completed. The solution is to budget time for each issue. When that budgeted time has elapsed, the issue should be reviewed.

The choices for the review are

  1. divide this issue into parts and push what cannot be done in this sprint back to the backlog (which could even be the entire issue)
  2. push other issues to the backlog to free time in this sprint for another allocation of time to this issue
  3. both of the above.  Push part of the issue to the backlog, but still allow a new block of time for this now simplified issue and push other issues to the backlog to free up time for the now reduced issue
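As a rough illustration, the budget check and the second review option could look like the following Python sketch. The names needs_review and free_up_time, and the idea of keeping the sprint as a priority-ordered list of (name, budgeted hours) pairs, are assumptions made for this example, not part of Scrum itself.

```python
def needs_review(spent_hours, budget_hours, done=False):
    """An unfinished issue is due for review once its time budget is used up."""
    return not done and spent_hours >= budget_hours


def free_up_time(sprint, backlog, hours_needed, keep):
    """Option 2: push low-priority issues back until enough time is freed.

    `sprint` is a list of (name, budgeted_hours), highest priority first.
    The issue named `keep` is the one under review and is never pushed back.
    Returns the number of hours actually freed.
    """
    freed = 0
    for issue in reversed(list(sprint)):   # lowest priority first
        if freed >= hours_needed:
            break
        name, hours = issue
        if name == keep:
            continue
        sprint.remove(issue)
        backlog.append(issue)
        freed += hours
    return freed


sprint = [("login", 8), ("search", 4), ("export", 6)]
backlog = []
if needs_review(spent_hours=8, budget_hours=8):
    # Grant "login" another 8 hours by pushing other work to the backlog.
    freed = free_up_time(sprint, backlog, hours_needed=8, keep="login")
```

Option 1 is the same move applied to the issue itself (or to a part split off from it), and option 3 combines the two.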

Conclusion: Schedule, design, code.

Sprints should be set with fixed end dates, or at least end dates that will have only a small window of variation. As the window approaches, new tasks are pushed back instead of being started when there is insufficient time, and only tasks near completion can affect the sprint close date within a predetermined window.

Each task can be scheduled before the task is started, and the schedule should allow a high probability the task will be complete, but this cannot be guaranteed. At the end of the schedule time the task should be reviewed.  Either this task is pushed back, another task is pushed back, or the sprint will be late.
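A minimal sketch of scheduling against a fixed sprint length, assuming each task has been given a time budget in advance. The function plan_sprint and the task names are hypothetical, and a real team would weigh more than hours, but the sketch shows the key property: the capacity is fixed and the task list is what gives.

```python
def plan_sprint(tasks, capacity_hours):
    """Fill a fixed-length sprint in priority order; the rest is pushed back.

    `tasks` is a list of (name, budgeted_hours), highest priority first.
    Returns (sprint, backlog) as two lists of task names.
    """
    sprint, backlog = [], []
    remaining = capacity_hours
    for name, budget in tasks:
        if budget <= remaining:
            sprint.append(name)
            remaining -= budget
        else:
            backlog.append(name)   # does not fit: wait for a later sprint
    return sprint, backlog


sprint, backlog = plan_sprint([("login", 16), ("search", 30), ("export", 8)], 40)
print(sprint)    # ['login', 'export']
print(backlog)   # ['search']
```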

So it turns out that the sequence: Schedule, Design, Code, can work.


TDD or Not TDD? That is the question!

What actually is TDD (Test Driven Development) ? Is TDD Dead?

Do you associate this term with tests actually driving development, or use the label TDD for the practice of ensuring code coverage by having unit tests? TDD can be taken to mean different things than the original meaning, and there are some risks from that shift in meaning.

I was recently searching online for discussion on TDD, and was surprised to find many pages describing TDD as simply ensuring unit tests are in place, but then other pages using TDD to refer to when Tests actually Drive Development. This difference in definition results in considerable confusion.

This page looks at what is accepted as best practice today, how that fits with the original meaning of TDD, the dangers and problems that can result, and already have resulted, from a shift in the meaning of TDD, and what is dead and what is not dead.



Unit Test

It is generally assumed that a reader of this page will know what a ‘unit test’ is, but for clarity, a unit test is a program function that sets up specific inputs and then calls a target software ‘unit’ in order to verify that the output of the target software unit is as expected when given those specific inputs. A software unit could be a function, a class or a module, or even an overall software package.
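For readers who prefer to see one, here is a minimal unit test using Python’s standard unittest module. The function celsius_to_fahrenheit is a made-up ‘unit’ chosen only for illustration.

```python
import unittest


def celsius_to_fahrenheit(celsius):
    """The software 'unit' under test (hypothetical example)."""
    return celsius * 9 / 5 + 32


class CelsiusToFahrenheitTest(unittest.TestCase):
    def test_boiling_point(self):
        # Specific input, expected output.
        self.assertEqual(celsius_to_fahrenheit(100), 212)

    def test_freezing_point(self):
        self.assertEqual(celsius_to_fahrenheit(0), 32)
```

Saved to a file, these tests can be run with python -m unittest.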

Unit Tests

‘Unit Tests’, plural, or perhaps even clearer (but longer) a ‘unit test suite’ denotes a set of unit tests that should contain sufficient individual tests to infer that a software ‘unit’ will perform as expected, for each possible combination of inputs that the software unit under test could be expected to encounter in normal use.

TDD (Test Driven Development)

There is no universally agreed meaning of TDD. There is the original meaning by Kent Beck, and some say even Kent has changed ideas, as we all do, but the original meaning is the only one in a book, so on this page I will tend to use that original meaning, except where I specifically discuss how people take TDD to mean something different.

From the original meaning, TDD is using tests to drive development. Such tests are specifically created not to form a test suite, but to enable software design and development. Some tests created during Test Driven Development are useful for a test suite, some may become redundant once software has been developed, and the TDD process does not automatically result in a complete set of Unit Tests.

Assertion Test.

This is a term introduced here, and can help in reading this page if nothing else. Unit tests can have one or more assertions. These assertions should together make a cohesive unit test, and that is discussed on another page. In the following examples, Uncle Bob sometimes says he is adding a new unit test, when in fact he then adds a new assertion to an existing unit test. How many assertions does it take to make a unit test? Ideally one, but in the real world it may take more. When this page refers to an assertion test, it is an individual (assertion) component of a unit test, and it could be confusing to describe that as a unit test.

Common to both TDD tests & Unit Tests (Test Suites)

Tests: The Only Real Specification

What does a program actually do? It passes the tests.

Any other specification is what someone believes the program should do, not what the program actually does.

A program is measured by its tests, and the results of those tests are the only real specifications. Confusingly, sometimes design goals are described as specifications.

Consider the specification of a camera or a car. Almost all specifications are established by measuring the values that are specifications, e.g. engine power in horsepower or kilowatts. Certainly, the measured value may match the value that was the design goal, but if, for example, the car had a design goal of engine power of 110 kW but is actually measured to produce 105 kW, it is only the measured value, not the design goal, which can be quoted as the product specification. If the design goal was quoted as a specification, a customer would feel misled.

A program is measured by its tests, and the result of those tests is the real specification.

Easily Repeatable Automated Tests Are Best.

Some code is difficult to test automatically. How do you test a function within a program that prints, for example? For some code it is simply far easier to run the program and see what prints. In almost all cases, a system redesign to allow an automated unit test is the only satisfactory solution. Unit tests can even be presented as a system specification.
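One common shape for such a redesign, sketched here in Python, is to separate building the output from printing it, so that the part with the logic returns a value a test can check. The names render_banner and print_banner are invented for this example; redirect_stdout is from the standard contextlib module.

```python
import io
from contextlib import redirect_stdout


def render_banner(text):
    """Pure function: builds the output, so it is trivial to unit test."""
    border = "*" * (len(text) + 4)
    return f"{border}\n* {text} *\n{border}"


def print_banner(text):
    """Thin wrapper that does the actual printing."""
    print(render_banner(text))


# The logic can be tested without printing anything.
assert render_banner("hi") == "******\n* hi *\n******"

# If the printing wrapper itself must be tested, stdout can be captured.
buffer = io.StringIO()
with redirect_stdout(buffer):
    print_banner("hi")
assert buffer.getvalue() == render_banner("hi") + "\n"
```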

A Failing Test Before Any Production Code.

No code should ever be written without first predetermining what the code should do.  This simply means do not start a task without first deciding what constitutes completing that task. For unit tests, add the unit test before the code is in place (if the production code already exists, still run the test before including the code in the system). For TDD as originally proposed, the test should be added before the solution has been determined.

TDD vs Unit Tests

A TDD Example with ‘Uncle Bob’

The following video of a talk by Uncle Bob is very useful, but quite long, so the main points will be discussed here without needing to watch the entire video. Consider now the video from 24m05s through to 42m00s.

A total of 10 assertion tests are created. The first 9 assertion tests are best described as TDD tests, with the 10th test the only actual unit test assertion. This is because, as the story unfolds in the video, assertion tests 1 through 9 are all created without first creating the algorithm. There is no algorithm other than what emerges as a result of incrementally adjusting code to pass tests. These tests drive development of a solution to the requirements of each test. Test 10 (line 18) fits the definition of a conventional unit test. The algorithm code already exists and works before this last test is written, and this test never exists as a failing test.

In fact it could be argued that all of the first 9 assertions are no longer required once test #10 is added.  It could be argued that at least the first test helps at least with documentation.  Perhaps even the first and second test add to explanation of the code, but clearly having an assertion test for every value from 1 through 9 is somewhat redundant.

At the other extreme, test cases such as factoring 0 (zero), or negative numbers, are not considered. Sufficient tests to drive the development do not automatically ensure a full set of tests for all cases, and can leave some tests that are not really required once the development is complete.
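To make the discussion concrete without the video, the following is a Python sketch of the kind of code and assertions involved. The video works in Java; the name factors_of and the exact form of the loop here are my rendering for illustration, not a transcript of Uncle Bob’s code.

```python
def factors_of(n):
    """Return the prime factors of n in ascending order."""
    factors = []
    divisor = 2
    while n > 1:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    return factors


# Assertions 1 to 9: each was added as a failing test that drove the code.
assert factors_of(1) == []
assert factors_of(2) == [2]
assert factors_of(3) == [3]
assert factors_of(4) == [2, 2]
assert factors_of(5) == [5]
assert factors_of(6) == [2, 3]
assert factors_of(7) == [7]
assert factors_of(8) == [2, 2, 2]
assert factors_of(9) == [3, 3]

# Assertion 10: added after the algorithm already worked, so it never failed.
assert factors_of(2 * 2 * 3 * 3 * 5 * 7 * 11 * 11 * 13) == [2, 2, 3, 3, 5, 7, 11, 11, 13]

# Cases no test drove: this sketch quietly returns an empty list for both.
print(factors_of(0), factors_of(-6))   # [] []
```

Whether an empty list is the right answer for zero or a negative number is exactly the kind of question the driving tests never forced anyone to ask.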

Unit Test Without TDD Example

TDD or not, there is an important rule that the test should be in place before the code to be tested is in place, which enables verification that can fail, but that requirement does not make the test drive the solution. In fact, if the solution is obvious, the solution will drive the test.

Clearly, at least by the time of the example video, Uncle Bob actually knew in advance how to code the solution to prime factors. If you are Uncle Bob and already know how to code the solution, why not move directly to test #10? The advantage of using tests to drive development is that you can build up to the solution by adding new test cases, while having certainty that previous functionality still works. A solution can be developed step by step, with the increasing set of tests providing certainty that every previous step is not being broken. But what is the point of those steps if you already know the complete solution? In that case, why not just create tests that validate the overall solution?

If you have an algorithm at the outset, then you could move directly to test number 10, factorsOf(2x2x3x3x5x7x11x11x13), and bypass all the simplistic tests 1 through 9, which test cases so simple that if any of them failed, test 10 would fail anyway.

Benefits and Limitations of TDD.


The promise of TDD is that the problem can be reduced to the simplest solution that passes the required tests. When a complete solution seems challenging, instead of being locked out by the design challenge, development can commence immediately and build the solution piece by piece. In the Uncle Bob example, a solution to factorsOf() arises from the tests without any formal design process. In the late 90s, when Kent Beck and others first developed TDD, this seemed like magic. Not only did solutions arise without a formal design process, they saw that elegant solutions could arise from testing. It seemed all solutions could be provided this way, something which most proponents (including Uncle Bob, as discussed below) have since come to realise is not true. Design driven from tests can solve problems not solved otherwise, but it simply is not an optimum solution, or even a solution, for every problem.


There is a famous quotation: ‘a camel is a horse designed by a committee’. The implication is that when design tasks are split, an elegant overall design can be missed. Consider the factorisation function called with 101: factorsOf(101)

The main loop will test if every number from 11 through 100 is a factor of 101, when once 11 (where 11×11 > 101) is reached, it is already clear the number is prime.  No number between 11 and 100 need be tested.  Perhaps development driven by tests would never discover this inefficiency?
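One way to remove that inefficiency, sketched in Python, is to stop trial division once the divisor squared exceeds what remains of the number. The function name factors_of_bounded is an assumption for this example, and this is a standard trial-division refinement rather than anything taken from the video.

```python
def factors_of_bounded(n):
    """Prime factors of n, stopping once divisor * divisor exceeds n."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever is left has no divisor up to its square root, so it is prime.
        factors.append(n)
    return factors


print(factors_of_bounded(101))   # [101]
```

For 101 the loop tries only 2 through 10 and stops at 11, because 11×11 is already greater than 101. The insight comes from looking at the whole problem, not from any individual test.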

Balancing Benefits and Limitations.

A solution arrived at through tests will not always be better than a solution planned by studying the overall problem. The best approach is to consider both methods and compare solutions. Driving to a solution through tests can break through when no overall solution is clear, but in the end very few software projects are as simple overall as the factorsOf example. Most often it is only parts of the solution that will have an immediate clear solution.

Solutions where possible should start with an architecture, but as code is built and tested the results allow for redefining the architecture.

In some ways, the only difference between an immediately apparent solution and a solution driven by steps may be the size of the steps used to solve the problem. The factorsOf() project could actually be tackled as a single step, with a single test to be passed. But if the solution is not apparent, then break it into steps and incrementally add tests.

Most software projects are more significant than ‘factorsOf’ and are too large to be developed in one step before testing. They should be broken into steps, but should those steps be broken into smaller steps?

The balance between driving to a solution with staged tests and simply testing for the end result comes down to choosing the right sized steps to tackle as a single step.

The full original TDD has its place, but a more balanced development process should be taken overall.

The Three Rules of ‘TDD’?

Newton created three laws of motion. There are three laws of thermodynamics. Hey, even Isaac Asimov got to write three laws, so why not Uncle Bob? Note there are questions as to which definition of TDD these three rules apply to. But in the case of both thermodynamics and Isaac Asimov, later review resulted in a more fundamental ‘zeroth’ law, so perhaps some review of Uncle Bob’s laws is also acceptable? Uncle Bob compares his laws to procedures that surgeons treat as ‘law’. Although failure to follow the pre-surgery procedures suggests a surgeon is unprofessional, it should also be considered that following the procedures does not ensure a surgeon is a good surgeon. Following the laws for TDD alone will not ensure code is quality TDD code.

1. No production code without a failing test.

Recall that a test is a tangible specification, and at least at one level, this law should seem axiomatic. It could be translated as ‘have some specification of what you are going to code before you code, and you should not bother coding if the specification is already met’.

For example, suppose you set out to write a program that prints the national flag. Your test might be ‘when I run it, what it prints should look like the national flag’. The test is very subjective, and could be considered an ad-hoc test, and it is very hard to automate, but it is a test. There should always be a test before you write any code.

It is very important that the test is a unit test. However, in the rare cases where a unit test is not practical, having a test that is as concrete as possible is still essential. The clearer the specification, the better. A project can be started without a concrete overall specification, but at the very least each stage should be specified before that stage is commenced. The specification, and hence the test, can still have flexibility. But deciding how flexible, and what test(s) to apply, is critical.

I suggest this law is essential to any software development. No production code without a failing test, and unless there is a very sound reason why it is impractical, that test should be a unit test.

2. Apply tests one at a time, in the smallest increments possible

I have changed this ‘law’, and in fact still do not regard it as a clear ‘law’, but more as a goal. The goal is hard to word with the precision required for a ‘law’, and it is more difficult to determine when it is being broken or followed. The original wording from Uncle Bob, ‘You are not allowed to write any more of a unit test than is sufficient to fail, and compilation failures are failures’, has two problems. Firstly, it is open to being read as making mandatory the very part of the original Kent Beck definition of TDD that Uncle Bob is on the record as calling ‘horseshit’ (more on this later on this page). Secondly, the wording is open to different interpretations.

The original Kent Beck definition of TDD would require strict adherence to tests driving all development, including design. The code to meet test number ‘n’ for a system (test=specification) must be in place prior to writing test number ‘n+1’ (the next specification). Strictly adhering to this principle would mean that if someone says to you, “I want a new program, and it must do these three things…” you would stop them and say, “No, wait, I can only record one specification detail at a time! Wait until the code is in place for the first thing, before considering any further functionality!”. More normal convention would suggest that if it is planned that there are three things the program should do, surely what those three things are can be written down. If you have good tools, the best way to record those three ‘things’ or specifications is to record them as tests. Then those tests can still be activated one at a time, and that is what should be done. Appropriate TDD is to activate tests on the code incrementally, one at a time, but actually recording them ahead of time should not be banned. It is still possible to amend the specifications/tests as the system develops, without banning writing down suggested specifications/tests ahead of time in any form, either as code or in any other language form.

The second problem of the ‘law’ is that words are open to interpretation. What exactly is sufficient to fail?  Perhaps ‘sufficient to be used as a failing test’ makes more sense?  And what does ‘write’ mean?  If a future test occurs to you ahead of time, you should never write it down? In practice, there should be some way of recording that tests are not to be applied yet, even if it means commenting them out or preferably marking them as ‘future’ or some agreed notation.  With the factorsOf() example as explained and coded in the video, one assert at a time makes sense.  But if you know the solution, in which case there are too many asserts in the example, then adding all asserts you do need before adding code that should pass all asserts immediately simply makes sense. In fact, in the example, the last assert could be interpreted as several tests in one…..but it is still practical.
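With Python’s unittest, one possible ‘agreed notation’ is the skip decorator, which records a test in code without applying it yet. This is a sketch of that idea only; the stub factors_of and the test names are hypothetical.

```python
import unittest


def factors_of(n):
    """Stub: development has only reached the first test so far."""
    return []


class FactorsOfTest(unittest.TestCase):
    def test_one_has_no_factors(self):
        # The currently active test.
        self.assertEqual(factors_of(1), [])

    @unittest.skip("future: activate once the test above passes")
    def test_two_is_its_own_factor(self):
        # Recorded ahead of time, but not yet applied to the code.
        self.assertEqual(factors_of(2), [2])
```

Removing the decorator activates the next test, which then fails and drives the next change to the code. Skipped tests are reported separately by the test runner, so they are hard to forget.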

3. Once there is code that passes the tests, do not progress before considering tests for other conditions for the code just added.

Ok, this is not what Uncle Bob said in his laws (although it is followed in his example).  It could be claimed that this is about sound unit tests rather than under the heading TDD, but different people have different interpretations of terminology.

Uncle Bob’s third law is stated as: You are not allowed to write any more production code than is necessary to pass the one failing test. This to me is simply restating the first law: don’t write production code without a failing test. Once the test passes, you no longer have a failing test. This rule describes what you should not do once production code passes tests, but rather than a reminder of law 1, perhaps consider what you should do once production code passes tests. What you should do is think of other tests that are needed for that code. In the factorsOf() example, Uncle Bob adds his final test exactly as described here. What other tests are needed? In this case the factorsOf(2x2x3x3x5…) test is added. That this test never fails shows Uncle Bob actually follows this amended third law.

The Confusion: Is TDD Dead?

At least three interpretations of the term ‘TDD’ are in use, including :

  1. The Original Kent Beck Full Concept of Using Tests to Drive Development (including design)
  2. Never Code without a failing test
  3. Any Use of Unit Tests is TDD

With such variation of meaning confusion sets in.  One expert, who is using definition number 2, declares “any development not using TDD is unprofessional”.  Then another expert, hearing the statement but themselves using definition #1 responds “TDD has some uses, but more elegant designs can result from not using TDD”.  Then a third, non expert, hears that second statement, but connects the statement with definition #3 and declares “experts declare that Unit Tests block the writing of quality software”.

You can see this play out over and over on the internet. You will see people claiming TDD is essential and others claiming TDD is dead... without the posters ever checking what exactly either those they are debating with, or their sources, specifically mean by TDD.

Here is Uncle Bob declaring that a key original idea of TDD is ‘horseshit’. Promoting a new definition of TDD has the problem, as pointed out by Jim Coplien, that people will find the original definition from the books and talks defining the topic, and believe that original idea is what they are being instructed to do.

Is TDD dead?

One of the original ideas within the original definition of TDD, that building all system architecture from tests will always produce the best solution, is indeed dead. Nothing else about the original TDD idea is dead. Unit tests are not dead, and building tests before coding is certainly not dead. Requiring all design to originate from tests is the only part of TDD that is dead. Building architecture from tests is also NOT dead, but it is now recognised that it will often not build the best architecture and is just one alternative, no longer a mandate. It has since been realised that traditional system design still makes sense, and is still needed. TDD is now usually redefined to exclude that one dead idea, and as such TDD is not dead, just the one idea that went too far. In fact TDD is redefined to mean many different things. Redefining TDD as something new, like TDD=Unit tests, and then declaring this redefined TDD dead is just confusing.

I have even seen more than one debate, as with the example already quoted from, where the against-TDD speaker effectively concedes that TDD as defined by the pro-TDD speaker does make sense, and that it is one specific part of the original definition that is dangerous. Arguments for and against TDD tend to arise from different interpretations of just what TDD actually means, and what definition different people are using.


Different definitions of what TDD means are in circulation. Before considering any point of view on TDD, it is advisable to check how the source of the opinion is interpreting the term TDD.  The originators of TDD did get ‘carried away’ with its capabilities, which are very useful, but those original ideas should not be turned into laws.

Code should only be written with a test first identified, and unless there is a very good reason otherwise, that test should be a unit test.

Driving Development by Tests is useful, especially for specific detailed problems, but is not a practice that provides all the answers and may not answer the big picture of what is required.

In all cases, production code should only be written with a test first identified, and unless there is a good reason why not, that test should be a unit test.

Neither full TDD, nor writing code only to failing tests,  will automatically result in a full Unit Test suite.

Python dataclasses: A revolution

Python data classes are a new feature in Python 3.7, which is currently in Beta 4 and scheduled for final release in June 2018.  However, a simple pip install brings a backport to Python 3.6.  The name dataclass sounds like this feature is specifically for classes with data and no methods, but in reality, just as with the Kotlin dataclass, the Python dataclass is for all classes and more.

I suggest that the introduction of the dataclass will transform the Python language, and in fact signal a more significant change than the move from Python 2 to Python 3.

The Problems Addressed By Dataclasses

There are two negatives with Python 3.6 classes.

‘hidden’ Class Instance Variables

The style rules for Python suggest that all instance variables should be initialised (assigned to some value) in the class __init__() method.  This at least allows scanning __init__ to reverse engineer a list of class instance variables. Surely an explicit declaration of the instance variables is preferable?

A tool such as PyCharm scans the class __init__() method to find all assignments, and then any reference to an object variable that was not found in the __init__() method is flagged as an error.

However, having to discover the instance variables for a class by scanning for assignments in the __init__() method is a poor substitute for scanning a more explicit declaration.  The body of the __init__() method essentially becomes part of the class statement.

The dataclass provides for much cleaner class declarations.

Class Overhead for Simple Classes

Creating basic classes in Python is too tedious. One solution is for programmers to use dictionaries as classes – but this is bad programming practice. Other solutions include namedtuples, the Struct class from the ObjDict package or the attrs package. With this number of different solutions, it is clear that people are looking for a solution.

The dataclass provides a cleaner, and arguably more powerful syntax than any of those alternatives, and provides the stated Python goal of one single clear solution.
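To make the comparison concrete, here is a minimal sketch of the same two-value record written as a dictionary, a namedtuple and a dataclass. The names point_dict, PointTuple and Point are purely illustrative, and the attrs and ObjDict alternatives are left out to keep the example to the standard library.

```python
from collections import namedtuple
from dataclasses import dataclass

# 1. A dictionary standing in for a class: no declared fields at all.
point_dict = {"x": 1.0, "y": 2.0}
point_dict["z"] = 3.0          # silently accepted, nothing declares the shape

# 2. A namedtuple: declared fields, but immutable and tuple-like.
PointTuple = namedtuple("PointTuple", ["x", "y"])
point_tuple = PointTuple(1.0, 2.0)


# 3. A dataclass: declared fields, mutable by default, otherwise a normal class.
@dataclass
class Point:
    x: float
    y: float


point = Point(1.0, 2.0)
point.x += 1.0
print(point)                   # Point(x=2.0, y=2.0)
```

The namedtuple rejects `point_tuple.x = 5.0` with an AttributeError, which is fine for immutable records but rules it out as a general replacement for a class.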

Dataclass Syntax Example

Below is an arbitrary class to implement an XY coordinate and provide addition and a __repr__ to allow simple printing.

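class XYPoint:

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        new = XYPoint(self.x, self.y)
        new.x += other.x
        new.y += other.y
        return new

    def __repr__(self):
        # Assumed format, chosen to match the repr a dataclass generates
        return f"XYPoint(x={self.x}, y={self.y})"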

Now the same functionality using a data class:

from dataclasses import dataclass

@dataclass
class XYPoint:
    x: float
    y: float

    def __add__(self, other):
        new = XYPoint(self.x, self.y)
        new.x += other.x
        new.y += other.y
        return new

The dataclass is automatically provided with an __init__() method and a __repr__() method.

The class declaration now has the instance variables declared at the top of the class more explicitly.

The types of ‘x’ and ‘y’ are declared as float above, although to exactly match the previous example they should be of type Any. However, float is more precise, and more clearly illustrates that an annotation is usually a type.

The dataclass merely requires the class variables to be annotated using variable annotation.  ‘Any‘ provides a type completely open to any type, as is traditional with Python.  In fact, the type is currently just for documentation and is not type checked, so you could state ‘str‘ and supply an ‘int‘, and no warning or error is raised.
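A short sketch of that last point: the class name Labelled and its fields are invented for illustration, and the only claim being demonstrated is that the dataclass machinery itself does not check the declared types at runtime (an external checker such as mypy would still flag the call).

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Labelled:
    name: str      # annotated as str, but nothing enforces this at runtime
    payload: Any   # Any documents that every type is acceptable


# An int is supplied where str was declared; no warning or error is raised.
item = Labelled(name=42, payload=[1, 2, 3])
print(item)              # Labelled(name=42, payload=[1, 2, 3])
print(type(item.name))   # <class 'int'>
```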

As you can see from the example, dataclass does not mean the class implemented is a data only class, but rather the class contains some data which almost all classes do. There are code savings for simple classes, mostly around the __init__ and __repr__ and other simple operations with the data, but the cleaner declaration syntax could be considered the main benefit and is useful for any class.

When to Use Dataclasses

The Candidates

The most significant candidates are any Python class and any dictionary used in place of a class.

Other examples are namedtuples and Struct classes from the ObjDict package.

Code using the attrs package can migrate to the more straightforward dataclass which has improved syntax at the expense of losing attrs compatibility with older Python versions.

Performance Considerations

There is no significant change to performance by using a dataclass in place of a regular Python class. A namedtuple could be slightly faster for some uses of immutable, ‘method free’ objects, but for all practical purposes, a dataclass introduces no performance overhead, and although there may be a reduction in code, this is insignificant. This video shows the results of actual performance tests.

Compatibility Limitations?

The primary compatibility constraint is that dataclasses require Python 3.6 or higher. With Python 3.6 having been released in 2016, most potential deployments are well supported, leaving the main restriction as not being available in Python 2.

The only other compatibility limitation applies to classes with existing type annotated class variables.

Any class which can limit support to Python 3.6+, and does not have type annotated class variables, can add the dataclass decorator without compatibility problems.

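Just adding the dataclass decorator to a class with no annotated class variables adds no fields and leaves any existing methods in place, but without then adding data fields it does not bring significant new functionality either. One caveat: the decorator adds an __eq__ method if the class does not define one and, by default, sets __hash__ to None if the class does not define __hash__ explicitly, so code that relies on identity comparison or on hashing instances should be checked. Otherwise, this compatibility means data fields can be added incrementally as desired, with no extra step needed to ensure compatibility.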

New capabilities deliver nothing without code that makes use of them. Unless the class is part of a library merely getting a spec upgrade, conversion to dataclasses makes most sense either when refactoring for readability or when code makes use of one or more functions made available by converting to a dataclass.

The ‘free’ Functionality

In addition to the clean syntax, the features provided automatically to dataclasses are:

  • Methods generated automatically if not already defined
    • __init__ method code to save parameters
    • __repr__ to allow quick display of class data, e.g. for informative debugging
    • __eq__, plus the other comparison methods when order=True is requested
    • __hash__, allowing instances to function as dictionary keys (generated when the dataclass is frozen, or when unsafe_hash=True is requested)
  • Helper functions (see PEP for details)
    • fields() returns a tuple of the fields of the dataclass
    • asdict() returns a dictionary of the class data fields
    • astuple() returns a tuple of the dataclass fields
    • make_dataclass() as a factory method
    • replace() to generate a modified clone of a dataclass
    • is_dataclass
  • New Standardized Metadata
    • more information in a standard form for new methods and classes
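The helper functions in the list above can be sketched as follows. XYPoint here is a fresh, frozen variant written just for this illustration (frozen so that __hash__ is generated), and Pair is a hypothetical class name passed to make_dataclass().

```python
from dataclasses import (
    asdict, astuple, dataclass, fields, is_dataclass, make_dataclass, replace,
)


@dataclass(frozen=True)   # frozen plus the default eq=True also generates __hash__
class XYPoint:
    x: float
    y: float = 0.0


p = XYPoint(1.0, 2.0)

field_names = [f.name for f in fields(p)]   # ['x', 'y']
as_dict = asdict(p)                         # {'x': 1.0, 'y': 2.0}
as_tuple = astuple(p)                       # (1.0, 2.0)
moved = replace(p, y=5.0)                   # XYPoint(x=1.0, y=5.0)
lookup = {p: "start"}                       # hashable, so usable as a dict key

# make_dataclass() builds an equivalent class dynamically.
Pair = make_dataclass("Pair", [("left", int), ("right", int)])
pair = Pair(1, 2)
print(pair, is_dataclass(pair))             # Pair(left=1, right=2) True
```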

Dataclass Full Syntax & Implementation

How Dataclasses Work

dataclass is based on the dataclass decorator. This decorator inspects the class, generates relevant metadata, then adds the required methods to the class.

The first step is to scan the class __annotations__ data. The __annotations__ data has an entry for each class level variable provided with an annotation.  Since variable annotations only appeared in Python 3.6, and class level variables are not common, there is no significant amount of legacy code with annotated class level variables.

This list is scanned for class level variables whose values are of the type Field, as returned by the field() function. Such values can carry additional data for building the metadata, which is stored in __dataclass_fields__ and __dataclass_params__. Once these two metadata attributes are built, the standard methods are added if they are not already present in the class. Note that while an existing __init__() method blocks the very desirable, boilerplate removing, automatic __init__ method, simply renaming __init__ to __post_init__ allows retaining any code desired in an __init__ while removing the distracting boilerplate.

This process means that any class level variables that are not annotated are ignored by the dataclass decorator and not impacted by the move to a dataclass.
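The steps described above can be observed directly. In this sketch the class Reading and its members are invented for illustration; the point is only which class level names end up as fields and where the metadata lands.

```python
from dataclasses import dataclass, field


@dataclass
class Reading:
    unit = "mm"                                  # no annotation: ignored by the decorator
    value: float                                 # annotated: becomes a field
    note: str = field(default="", repr=False)    # field() carries extra settings


print(Reading.__annotations__)             # {'value': <class 'float'>, 'note': <class 'str'>}
print(list(Reading.__dataclass_fields__))  # ['value', 'note']
print(Reading.__dataclass_params__)        # the decorator options (init, repr, eq, ...)

r = Reading(2.5, note="checked")
print(r)                                   # Reading(value=2.5)
print(Reading.unit)                        # mm
```

The unannotated unit stays an ordinary class variable, while note shows a field() value changing how the generated __repr__ behaves.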

Converting Class

Consider the previous example, which was very simple. Real classes have default __init__ parameters, instance variables that are not passed to __init__, and code that will not be replaced with the automatic __init__. Here is a slightly more complicated contrived example to cover those complications with a straightforward use case.

This example adds a class level variable, last_serial_no, just to have an example of a working, class level variable, which allows a counter of each instance of the class.

Also added is serial_no which holds a serial number for each instance of the class.  Although it makes more sense to always increment the serial number by 1, an optional __init__ parameter allows incrementing by another value, showing how to deal with __init__ parameters which cannot be processed by the default __init__ method.

class XYPoint:

    last_serial_no = 0

    def __init__(self, x, y=0, skip=1):
        self.x = x
        self.y = y
        self.serial_no = self.__class__.last_serial_no + skip
        self.__class__.last_serial_no = self.serial_no

    def __add__(self, other):
        new = XYPoint(self.x, self.y)
        new.x += other.x
        new.y += other.y
        return new

    def __repr__(self):
        # Assumed format, chosen to match the repr a dataclass generates
        return f"XYPoint(x={self.x}, y={self.y}, serial_no={self.serial_no})"

Now the same functionality using a dataclass.

from dataclasses import dataclass, field, InitVar

@dataclass
class XYPoint:
    last_serial_no = 0
    x: float
    y: float = 0
    skip: InitVar[int] = 1
    serial_no: int = field(init=False)

    def __post_init__(self, skip):
        # skip is an InitVar: it is passed in here and is not stored as a field
        self.serial_no = self.last_serial_no + skip
        self.__class__.last_serial_no = self.serial_no

    def __add__(self, other):
        new = XYPoint(self.x, self.y)
        new.x += other.x
        new.y += other.y
        return new

The class level variable without annotation needs no change. The __init__ parameter that is not also an instance variable has the InitVar type wrapper. This ensures it is passed through to __post_init__  which provides all __init__ logic that is not automatic.

The serial number is an instance variable, or field, that is not an __init__ parameter. To change the default settings for a field such as this, assign a field() call to it; the field() call can still include a default value as one of its parameters.
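As a rough illustration of assigning field() to change a field's settings, the sketch below uses an invented Route class. The parameters shown (default, default_factory, init and repr) are real field() options; everything else is a made-up example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Route:
    name: str
    # A mutable default needs default_factory; a bare [] raises ValueError.
    stops: List[str] = field(default_factory=list)
    # Not an __init__ parameter, but still given a default value.
    visits: int = field(default=0, init=False)
    # Accepted by __init__ but left out of the generated __repr__.
    secret: str = field(default="", repr=False)


a = Route("north")
b = Route("south", ["dock"], secret="key")
a.stops.append("mill")   # each instance gets its own list

print(a)   # Route(name='north', stops=['mill'], visits=0)
print(b)   # Route(name='south', stops=['dock'], visits=0)
```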

I think this example covers every realistic use requirement to convert any existing class.

Types and Python

Dataclasses are based on usage of annotations. As noted in annotations, there is no requirement that annotations be types. The reason for providing annotations was primarily driven by the need to allow for third-party type hinting.

Dataclasses do give the first use of annotations (and by implication, potentially types) in the Python standard libraries.

Annotating with None or docstrings is possible. There are many in the Python community adamant that types will never be required, nor become the convention. I do see optional types slowly creeping in though.

Issues and Considerations

It is possible there are some issues with existing classes which use class level variables and instance variables, but none have been found so far, which leaves this section as mostly ‘to be added’ (check back).


There is a strong case that all classes, as well as namedtuples, and even other data not currently implemented as a class and some other constructs better implemented as classes, should move to dataclasses. For small classes, and all classes start as small classes, there is the advantage of saving some boilerplate code. Reducing boilerplate code makes it easier to maintain, and ultimately more readable.

Ultimately, the main benefit is that any class written as a dataclass is more readable and maintainable than without dataclasses. Converting existing classes is often as simple as adding the decorator, annotating the fields, and renaming __init__ to __post_init__.

Python Annotations

Python syntax allows for two flavours of annotations:

  • Function annotations, introduced in Python 3.0
  • Variable annotations, introduced in Python 3.6

Both flavours of annotation work in the same manner.

They build a dictionary called __annotations__ which stores the list of annotations for a function, a module or a class.

It is common practice to annotate with types, such as int or str, but Python language implementation allows any valid expression. Using types can make Python look like other languages which have similar syntax and require types, and is one of the motivations for annotations in Python. Third-party tools may report expressions which are not types as errors, but Python itself currently allows any expression.

The Python dataclass now makes use of annotations.  Outside of this use, if you are not using an external type validator like mypy there seems little incentive to bother with type annotations, but  if you are going to document what type a variable should be, then annotation is the optimum solution.

The following code illustrates variable annotation at class level, module level, and local level:

>>> class Foo:

    class_var1: "Annotation"
    class_var2: "Another" + " annotation" = 3
    class_int: int

    def func(self):
        local1 : int
        local2 : undeclared_var
        self.variable : undeclared * 2 = 7
>>> module_var: Foo = Foo()

>>> module_var2: 3*2 = 7
>>> module_var3: "another" + " " + "one"
>>> Foo.__annotations__
{'class_var1': 'Annotation', 'class_var2': 'Another annotation',
'class_int': <class 'int'>}
>>> __annotations__
{'module_var': <class '__main__.Foo'>, 'module_var2': 6, 'module_var3': 'another one'}
>>> f = Foo()
>>> f.func()
>>> f.variable
7

Class Variables: The code annotates 3 identifiers at Foo scope; these identifiers, and their annotations, all then appear in Foo.__annotations__. Note that only class_var2 is actually a variable and will appear in a dir() for Foo.  class_var1 and class_int appear in __annotations__ but are not actually created as variables.

Module Variables: Three module_var variables are annotated at the module level, and all appear in __annotations__. Again, module_var3 does not appear in globals, as annotation itself does not actually create the variable; it solely creates the entry in __annotations__.  (module_var and module_var2 are assigned values, so are actual variables).

Local & Instance Variables: The func within the Foo class illustrates two local annotations, one of which uses an undeclared_var. This use of an undeclared identifier would generate an error with either class or module variables, because in those cases the expression is evaluated for the relevant __annotations__ dictionary. The expressions for local and instance variable annotations are not evaluated. At this stage, I have not found anywhere that this annotation data is stored, which is consistent with the PEP’s statement that annotations for local variables are not stored.
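The three cases can also be checked from a script rather than the interactive prompt. This is a small sketch with invented names (declared_only, with_value, not_defined_anywhere); it only shows behaviour that can be observed at runtime, and a linter will rightly complain about the undefined name.

```python
class Foo:
    declared_only: int          # annotation only: no class attribute is created
    with_value: "a note" = 3    # annotation plus assignment: attribute exists

    def func(self):
        # Inside a function the annotation expression is not evaluated,
        # so the undefined name below does not raise NameError.
        local1: not_defined_anywhere
        self.variable: not_defined_anywhere = 7
        return self.variable


foo = Foo()
result = foo.func()

print(Foo.__annotations__)            # {'declared_only': <class 'int'>, 'with_value': 'a note'}
print(hasattr(Foo, "declared_only"))  # False
print(hasattr(Foo, "with_value"))     # True
print(Foo.func.__annotations__)       # {} because nothing is stored for the locals
print(result)                         # 7
```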

The PEP for variable annotations is available here. Note the stated goal is to enable third party type checking utilities, even though the implementation does not restrict annotations to types. The non-goals are also very interesting.

While the practice of using annotations only with valid types might be best practice, it is worth understanding the compiler does not require this.

Function annotations: Introduced in Python 3.0 (2008)

Here is an example of function annotation:

>>> def func(p1: int, p2: "this is also an int" + " but ...") -> float:
    return p1 + p2

>>> func.__annotations__
{'p1': <class 'int'>, 'p2': 'this is also an int but ...', 'return': <class 'float'>}

The expression following the ‘:‘ (colon) character (or the ‘->' symbol) is evaluated and the result stored in the __annotations__ dictionary.

The PEP is available here, but it is the fundamentals section that is the most highly recommended reading.

What is a DSL?

With Kotlin, the term ‘Kotlin DSL’ usually refers to DSLs built in Kotlin using specific Kotlin features (as discussed in ‘Kotlin DSLs’). This page, however, is about DSLs in general.

  • Introduction:
    • DSL: The general definition
    • DSL vs ‘a DSL’
  • The Types of DSL:
    1. External DSL
    2. Internal Detached DSL
    3. Internal Augmentation DSL
  • Detached vs Augmentation DSLs: A continuity?
  • Language Augmentation in general Vs An Augmentation DSL
  • Conclusion: DSL types can be quite different


DSL: The general definition

The acronym ‘DSL’ stands for ‘Domain Specific Language’. A ‘Domain’ is effectively a particular application or field of expertise. ‘Specific’ is self-explanatory, but what exactly is meant by ‘language’ does warrant further exploration later.

Contrasting with ‘general purpose languages’ which attempt to allow for solving any programming problem, a DSL can be purpose designed for a specific ‘domain’ or a specific type of problem.

The term DSL is a broad term, covering some different types of DSLs.  Sometimes people use the term DSL when they are referring to a specific type of DSL, resulting in the term appearing to mean different things in different contexts.

Martin Fowler (who has written books on DSLs that can be very worthwhile reading) described two different main types of DSL, External and Internal, which differ by how they are implemented.  Next, Martin Fowler explains that the second Implementation type, Internal, itself provides two types of DSL, the Internal Mini-language and the Internal Language Extension. This results in a total of three different types of DSL.

DSL vs a DSL

There is a semantic difference between ‘language’ and ‘a language’.  Consider the two phrases “he likes to use language which is considered antiquated” and “he likes to use a language which is considered antiquated”.  The first suggests vocabulary within a language, e.g. antiquated words within the English language; the second suggests use of a language such as ancient Greek or Latin.

Similarly, ‘domain specific language’ can be thought of as terms within a language which are specific to a particular domain, while ‘a domain specific language’ suggests an entirely new language developed for use in a specific domain.

The Types of DSL: External,  Detached & Augmentation

1. External DSLs

DSLs come in two main forms: external and internal. An external DSL is a language that is parsed independently of the host general purpose language: good examples include regular expressions and CSS.  Source: Martin Fowler.

These are DSLs like SQL or HTML: languages only applicable within a specific domain (such as databases, or web pages) which are stand-alone languages, but with functionality focused on that specific field or domain, and too limited to be used as a general purpose language.  Implementing a DSL as an external DSL enables the DSL to be unrelated to the programming language used to write the DSL.

External DSLs generally have the same goal as a Detached DSL, but are built using a different implementation method.

The key advantage for external DSLs is that by being independent of any base language, they work unchanged with any general language.  So SQL is the same DSL when working with Java, Python, Kotlin or C#.

The first problem with independent DSLs is that a task written using the DSL often also needs some general purpose language functionality. So the task will then be written in two languages: a general purpose language for part of the solution, and a DSL for another part.  The project requires two different languages.

The second problem with independent DSLs is that the features of the general purpose language are not accessible from within the DSL. This means the DSL may need to duplicate features already available in any general purpose languages. Such duplicated features are generally inferior to those in general purpose languages.  E.g. numeric expressions in SQL are not as powerful as most general purpose languages, and there is often a syntax change from the general purpose language.
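Both problems show up in even a tiny program. The sketch below uses Python's standard sqlite3 module with an invented point table: the Python code and the SQL strings are two separate languages, and the arithmetic inside the query is SQL's own expression syntax rather than Python's.

```python
import sqlite3

# Python is the general purpose language; the string literals below are SQL,
# an external DSL that the database engine parses independently of Python.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE point (x REAL, y REAL)")
conn.executemany(
    "INSERT INTO point VALUES (?, ?)",
    [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)],
)

# The expression x + y is evaluated by the SQL engine, not by Python.
rows = conn.execute("SELECT x + y FROM point WHERE x > 1 ORDER BY x").fetchall()
conn.close()

print(rows)   # [(7.0,), (11.0,)]
```

Python cannot see inside the SQL strings, so a typo in the query is only discovered when the database engine parses it at runtime.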

2. Internal Detached DSLs

When people talk about internal DSLs I see two styles: internal mini-languages and language enhancements.

An internal minilanguage is really using an internal DSL to do the same thing as you would with an external DSL.  Source: Martin Fowler.

Unlike an external DSL, you are limited by the syntax and programming model of your host language, but you do not need to bother with building a parser. You are also able to use the host language features in complicated cases should you need to.

Martin Fowler

Under Martin Fowler’s definition, a detached DSL is the first of two types of Internal DSL.  These Internal Detached DSLs, like External DSLs, are building their own ‘mini-language’ for a specific domain.  Detached DSLs are building ‘a domain specific language’ as opposed to ‘domain specific language’ vocabulary for an existing language.  With a Detached DSL, the new stand-alone language is created within an existing language. To achieve being a standalone language, the DSL needs to be separated or ‘detached’ from the host language.  Even if such a language is ‘fully-detached’ from the host language, it will normally be the case that some host language syntax is available from within the DSL.  In all cases, the rules and syntax of the DSL will be shaped by what can be built within the framework of the host language.

This Detached DSL is the type of DSL usually referred to in the discussion of Kotlin DSLs, and Gradle build files are an example of a Groovy Internal, Detached DSL.

As the goals are the same as External DSLs in creating what can be seen as a standalone language, these DSLs ideally require little understanding of the host language by those using the DSL.  So build.gradle files require, at least in theory, almost no understanding of the Groovy language, or perhaps more realistically, an understanding of only a tiny subset of the host language.  Kotlinx.html is a Kotlin example of this type of DSL built within Kotlin, and the actual Kotlinx.html syntax can seem very different to regular Kotlin syntax, even though all code is actually Kotlin.
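The same idea can be sketched in Python, loosely in the spirit of an HTML builder. Everything here is hypothetical (the Html class and its tag, text and render methods are invented for this illustration and are not any library's API), but it shows how nested blocks of host-language code can read as a small language of their own.

```python
from contextlib import contextmanager


class Html:
    """Hypothetical builder: collects indented tag lines as blocks are entered."""

    def __init__(self):
        self.lines = []
        self.depth = 0

    @contextmanager
    def tag(self, name):
        # Opening tag on entry, closing tag on exit, indented by nesting depth.
        self.lines.append("  " * self.depth + f"<{name}>")
        self.depth += 1
        try:
            yield self
        finally:
            self.depth -= 1
            self.lines.append("  " * self.depth + f"</{name}>")

    def text(self, content):
        self.lines.append("  " * self.depth + content)

    def render(self):
        return "\n".join(self.lines)


# The 'DSL' part: reads like a description of a page, yet is plain Python.
page = Html()
with page.tag("html"):
    with page.tag("body"):
        with page.tag("h1"):
            page.text("Hello")

print(page.render())
```

The rules of the mini-language are clearly shaped by the host: the nesting is Python's with statement and indentation, which is exactly the constraint described above.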

3: Internal Augmentation DSL.

The alternative way of using internal DSLs is quite different to anything you might do with an external DSL. This is where you are using DSL techniques to enhance the host language. A good example of this is many of the facilities of Ruby on Rails.  Martin Fowler.

Why build a complete language if you can just add features to an existing language?  This third type of DSL no longer has the goal of creating a standalone language. It is ‘domain specific language’ in the sense that a set of jargon words for a specific domain can be used in a conversation that is based in English.  The jargon provides new language, but the conversation overall is still in English.  To understand the conversation, you need to know English as well as the specific jargon.  Code using an augmentation DSL will still also make use of the host language.  The program is still seen as being in the original language, but using some additional definitions specific to the augmentation DSL. The goal of the augmentation DSL is to add new vocabulary or capability to an existing language, and this makes Augmentation DSLs quite different to the previous DSL types. Instead of an entire stand-alone new language, the result is an extension or augmentation to an existing ‘host’ language, effectively extending the power of the original host language to have new vocabulary and perhaps also new grammar. This enables the simple and concise expression of ideas and concepts from a specific domain while continuing to use the host language. The augmentation is to be used in combination with the power and flexibility of the host language, which allows for more general areas of programming in combination with programming for the specialist domain.

Such augmentations still require users to know the host language, but provide a more homogeneous solution than the combination of a stand-alone language with a general purpose language.  For example, while a Python program can build SQL commands to send directly to an SQL database server, an augmentation to Python such as SQLAlchemy allows the same power as the SQL language, all within the general syntax of Python.
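To give a feel for how such an augmentation can work, here is a deliberately tiny sketch that uses operator overloading to build a parameterised query from ordinary Python expressions. It is not SQLAlchemy's API: Column, Condition and select are hypothetical names invented for this illustration.

```python
class Condition:
    """Hypothetical fragment of a WHERE clause plus its bound parameters."""

    def __init__(self, sql, params):
        self.sql = sql
        self.params = params

    def __and__(self, other):
        return Condition(f"({self.sql}) AND ({other.sql})", self.params + other.params)


class Column:
    """Hypothetical column reference; comparison operators build Conditions."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Condition(f"{self.name} > ?", [value])

    def __eq__(self, value):
        return Condition(f"{self.name} = ?", [value])


def select(table, where):
    return f"SELECT * FROM {table} WHERE {where.sql}", where.params


x, y = Column("x"), Column("y")
min_x = 1   # an ordinary Python variable used inside the 'query'
query, params = select("point", (x > min_x) & (y == 4))

print(query)    # SELECT * FROM point WHERE (x > ?) AND (y = ?)
print(params)   # [1, 4]
```

The query is written with Python's own operators and variables, so the domain vocabulary (columns, conditions) sits inside the host language rather than in a separate string of SQL.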

Detached vs Augmentation DSLs: A continuity?

Both Detached DSLs and Augmentation DSLs are built inside an existing language, and the same set of language features can be used to build either type of DSL.  It is only the goal that is different: build a syntax that feels detached from the host language, or build a syntax that integrates with the host language.

The reality is not every detached DSL is fully detached from the host language, and many do require knowing the host language.

There is a clear test for a fully Detached DSL: if the DSL can be read, or written, by people with knowledge only of the DSL, without needing knowledge of the host language, then it is a fully detached language. Gradle build files are an example of an internal detached DSL that passes this test, as you can write build files without knowing the host language (which can be either Groovy or Kotlin).

However,  just because the DSL syntax can be used fully detached from the host language, does not mean actual code in the DSL always will be fully detached from the host language.   For example, Gradle build files can make use of the host language syntax within the build file, and when that host syntax is used, the result is  a build file that does require a knowledge of the host language (which can actually be either Groovy or Kotlin). So for some code, even with a DSL capable of fully detached use,  working with that code will require knowledge of the host language.

A DSL can be designed so that fully detached code is possible, but with the host language syntax available, it cannot be guaranteed that all code will be fully detached.

Further, in practice many examples seek to be only partially detached from the host language.  In fact our own examples all fit this pattern, as the semi-detached code actually exists interspersed with Kotlin code and there is no goal to enable the code to be read without knowing Kotlin.

Martin Fowler quotes the examples of the Rake DSL as being able to be categorised as either an independent language or an extension, which in my terminology would suggest it is more to the centre of the continuum.

When we use the term ‘Kotlin DSL’ or even ‘Python DSL’, we mean a DSL made by augmenting Kotlin or Python with extra ‘vocabulary’ for domain specific features and, more rarely, extra grammar.  The DSL is a set of new language constructs which extends an existing language.

Technically, this is always an extended language, but if the goal is to allow the use of these extensions by themselves you have an independent language DSL, and if the goal is to allow programs in the host language access to new additional syntax, you have a Language Extension DSL.

An Augmentation DSL vs Language Augmentation

As discussed in languages, all but the simplest human communication makes use of language augmentation, and all but the simplest programs define variables, functions and other elements that then become part of the language used elsewhere in the program.  An augmentation DSL is created when a specific block of language augmentation (definitions of variables, functions, classes or even syntax) is separated from any specific application using that augmentation, and is provided for the use of any application which may require the same functionality.

Conclusion: DSL types can be quite different.

The Rake DSL (Detached/Augmentation hybrid DSL), Gradle (Detached DSL) and HTML (External DSL): these are all greatly different examples that can all be called DSLs.

When the term DSL is used, it can refer to DSLs in general, but more often one of three entirely different types is being discussed, and discussed as if all DSLs are of that type, which can be confusing if you are usually dealing with one of the other DSL types.  The term DSL is an extension of programming jargon, but perhaps it would be useful to have three additional terms (making a four-word language extension), with an agreed adjective for each of the three types of DSL.