Solving complex problems in software development – part 2/6 – A Walk in the Park…

Yes, you are on the right track. How on earth can this be a Walk in the Park when it is a totally new business, a new sub-domain within the domain of trains?

And of course it cannot, because the interface towards the train hardware differs in vital ways between an ATP system and an ATO system.

The input and output signals of the ATP software are very close to digital, and fail-safe as well, with very accurately time-stamped speed and balise detection. Many of the input and output signals of the ATO software, however, are analogue in character. This means that inputs to the ATO, such as speed and odometer updates, are not exact, and that the train hardware cannot carry out the ATO's outputs to the brakes and propulsion exactly. Add to this the reaction time from an ATO output, for example a brake command, until the train is actually braking with that value; the approximation of the value when the train hardware does not have enough resolution; and the fact that the hardware itself changes over time, for example as the wheels and brakes continuously wear.
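To make the difference concrete, here is a minimal sketch of a single brake-to-stop manoeuvre. It is not the project's software. It is a deliberately simplified, open-loop model, and the function name, all parameter names and all numbers are assumptions chosen for illustration. It compares the stopping distance predicted when signals are treated as "digital and exact" with the distance obtained once reaction time, limited brake resolution, brake wear and a speed reading that is slightly off are taken into account.

```python
def stop_error(speed_measured, brake_cmd, *, reaction_s=0.0,
               brake_step=0.0, brake_gain=1.0, speed_scale=1.0):
    """Return the stopping error in metres (positive = overshoot).

    Hypothetical parameters, not taken from any real ATO:
      speed_measured - speed the ATO believes it has (m/s)
      brake_cmd      - deceleration the ATO commands (m/s^2)
      reaction_s     - time until the train brakes with the commanded value
      brake_step     - resolution of the brake hardware (0 = exact)
      brake_gain     - delivered / requested deceleration (worn brakes < 1)
      speed_scale    - measured / true speed (worn wheels shift this)
    """
    # What an "everything is exact" model predicts: d = v^2 / (2a).
    planned = speed_measured ** 2 / (2 * brake_cmd)

    # What the train hardware actually does with the command.
    true_speed = speed_measured / speed_scale
    applied = brake_cmd
    if brake_step > 0:
        # The hardware can only realise multiples of its resolution.
        applied = round(brake_cmd / brake_step) * brake_step
    applied *= brake_gain
    if applied <= 0:
        raise ValueError("train never brakes with these parameters")

    # Distance rolled during the reaction time plus the braking distance.
    actual = true_speed * reaction_s + true_speed ** 2 / (2 * applied)
    return actual - planned


if __name__ == "__main__":
    cases = {
        "exact signals": {},
        "0.5 s reaction time": {"reaction_s": 0.5},
        "coarse brake steps": {"brake_step": 0.3},
        "worn brakes": {"brake_gain": 0.95},
        "speed reads 2 % high": {"speed_scale": 1.02},
    }
    for name, params in cases.items():
        error = stop_error(20.0, 1.0, **params)
        print(f"{name:22s} {error:+8.2f} m")
```

With exact signals the error is zero, which is what a simulator that models the hardware as digital will always report. A reaction time of half a second at 20 m/s alone moves the stopping point by 10 m in this model. A real ATO regulates continuously and would correct part of such errors, so the numbers only show the direction of the effects, not their real size.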

So, the way the company’s history with fail-safe ATP and traction systems affects the implementation of the ATO system can be divided into three parts:

  • The traction hardware and the test equipment hardware require only small adaptations, actually only some new cables.
  • The new ATO software is written by ATP programmers, who are used to filling the gaps in a specification themselves and who build the ATO software in the same way.
  • The test equipment’s software follows the same pattern, and only the knowledge about trains that is necessary for making an ATP system is considered.

The inexactness of train hardware is not perceived, and therefore not elaborated on, by the team. The consequence is that the input and output signals are handled as if they were digital and exact. The team does not know that they are in the land of the unknown unknowns. And with their history they cannot possibly know it, because how can they understand that they do not understand? And if we do not understand what interface our product has, we cannot write a good test system either.

So, this is the perception of the work to be performed according to the company and the ATO project.

In the mid-1990s, a young engineer starts his first job at this company after graduating from a technical university, where he focused on computer engineering. He joins the ATO project, which by then has been running for over two years. At that point, what remains is mainly verification with two different train hardware suppliers, followed by testing on site and, later, the acceptance test with the customer.

During his first year the project is drained of even more resources, since it has low status and is not regarded as high-tech enough. The reason is that the employees would rather work with the fail-safe signalling software with A/B programming (where of course A and B can be seen as safe-to-fail, even though the intention is to make the system fail-safe to an even higher degree). By a stroke of luck, our engineer has taken a course whose content he can use directly in his first job: a real-time programming language similar to the one used in the ATO. This makes it easy for him to learn the ATO software by testing it in the train and track simulator, noting the bugs and then fixing them himself.

Does our new engineer understand what is happening? No. He cannot help us either, even though he is a good programmer, corrects many bugs and is focused on learning the software. Since he does not know anything about trains, he cannot judge the situation either. And as long as no one knows anything about trains, the ATO project will remain in the unknown unknowns.

All test results back home in the lab show that everything is perfect (the train always stopped within the limits, -34 cm short in the simulator) and that we are in the Obvious domain of the Cynefin™ framework. Everything clearly points to the testing with trains and all the surrounding equipment at the customer being a Walk in the Park.

When we think we have it all figured out, we believe that we are strolling in the beautiful park, and it is hard to think differently. But we already know what happens when we believe that we are in the ordered domains of the Cynefin framework when we really are in the Complex domain. Reality will kick our ass, right? That will be handled in the next blog post. See you then.
