Foundations of ISA – Base paradigm – Scale ISA global
Iteration 3: Useless concepts
The underlying Object Orientation paradigm is now well established, in particular through guidelines to follow. Before building our basic set of requirements (in other words, the white list), and of course before beginning to explore ISA's solution space, I would like to make a black list: the list of all the concepts that ISA does not need and, even more, of those that should absolutely be avoided.
This is the purpose of this post: to eliminate the useless concepts that pollute the real IT world nowadays.
What is a bag? Anything so general that no structure can be associated with it.
Why is this wrong in ISA? Because you cannot assume anything about the things you manipulate, and that makes the architecture neither uniform nor predictable. There is nothing you can rely on, such as basic types and behaviors, the technical representation, the size, the scalability, etc. That is the situation we experience every day.
To sum up, because it is the opposite of the “everything is an object” approach.
Everything you can liken to a bag or a heap must be avoided. Here are some representative examples:
- File and Directory
- Resource (REST)
- Void*
- Socket
- Stream of bytes
- Memory dump
The most basic thing manipulated is an Object. This guarantees that several properties are always available, like the features of elementary Classes.
To support anything generically, even an unknown thing, all ISA objects inherit from the Object root class.
The memory dump is a particularly vicious way to access private data. It gives access to passwords or to their hash codes, even when they are not stored. So, having this kind of dumper software as a basic tool of the Operating System should be prohibited.
This means that there is no file system in ISA.
Therefore, the notion of generic entry point, like a static main function, doesn't make sense in our context.
For reasons similar to the rejection of the main function above, the notion of process has to be questioned.
What is a Process?
- A set of instructions to execute (a program) loaded from one executable file.
- An addressable memory space.
- A set of accessible resources.
- A set of attributes (security, priority, etc.).
- The state of processor registers.
Is that an Object? A business Object? A functionality? If yes, its extent ranges from a quite small, single-function one (like a shell command) to ones with a very large scope (like MS Word). This is troublesome.
The scope of a Process as a whole is not well defined. This is because, historically, the process comes from the need to manage parallel execution: multiprocessing was invented to solve this.
Originally, programs just consisted of sequences of instructions for processors. They have nothing to do with Object Orientation. For instance, access to shared Objects (like libraries, peripherals, search engine, user, security features, etc.) is not done in a uniform way.
So, as a file is a bag of data, is a process just a bag of processing? Or can it be considered to be an Object? It is true that it follows all the OO Axioms. However, it does so in a particular manner. If it is an Object, it is a technical one.
Does ISA require this kind of technical object?
At first glance, it seems preferable to let the Objects interact freely with one another. They share the same foundation:
- The concurrency question is addressed by the policy “one ‘thread’ per Object”.
- The entry point has been covered above (the main function of a process). More generally, the Use Cases could inherit some of the properties of current Processes.
- The security question must be addressed for all Objects, in the base class “Object”. This is the only way to make the software secure from the foundation to the roof: all building blocks are secured in the same way.
- The administration question remains. What is managed in a process? CPU, I/O and memory usage. These quantities can be managed at the Object level or at the Use Case level, and certainly much more accurately, because the grain is finer and its scope precisely defined.
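The “one ‘thread’ per Object” policy can be sketched as follows. This is only an illustration under assumptions: the ActiveObject and Counter classes, the mailbox queue and the send/stop methods are hypothetical names, not part of ISA, and a real system would leave message dispatch to the O.S.

```python
import queue
import threading


class ActiveObject:
    """Hypothetical base class: every Object owns one thread and one mailbox."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, operation, *args):
        """Post a message; return a queue on which the reply will arrive."""
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((operation, args, reply))
        return reply

    def stop(self):
        """Ask the Object's thread to finish after the pending messages."""
        self._mailbox.put(None)
        self._thread.join()

    def _run(self):
        # Messages are handled one at a time, so the state needs no lock.
        while True:
            message = self._mailbox.get()
            if message is None:
                break
            operation, args, reply = message
            reply.put(getattr(self, operation)(*args))


class Counter(ActiveObject):
    def __init__(self):
        self.value = 0          # state is set before the thread starts
        super().__init__()

    def increment(self, amount):
        self.value += amount
        return self.value


counter = Counter()
replies = [counter.send("increment", 1) for _ in range(100)]
results = [reply.get() for reply in replies]
counter.stop()
```

Because each Object serializes its own messages, the concurrency grain is the Object itself rather than a process. Error handling is deliberately left out of this sketch.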
So, the primary option is to state that the Process notion is not relevant for ISA. However, the consequences of this option must be explored in depth. Rolling back this decision will remain possible.
In the 2009 talk referenced below, Tony Hoare describes his invention of the null reference as a “billion-dollar mistake”:
I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare – Tony Hoare – 2009
What Is The Worst Mistake Ever Made In Computer Programming? – Anders Kaseorg – 2013
What are the general questions behind “null”? Multiplicity, of course, but also applicability.
Multiplicity
Multiplicity is the right concept to tell how many Objects can be in a given Place. It expresses the range of Objects in a single concept. It defines whether the Place is optional or mandatory, and how many Objects it holds at least and at most.
So much for the definition. There remains the question of the current value of this multiplicity at runtime.
There is no question when the multiplicity is a single-value range (1,1). The Place multiplicity is fixed by its definition in the class. Therefore, it can be checked at compile time.
If the multiplicity is a real range of integer values (2,n or 0,1), the Place multiplicity can have any value in the range defined by the class. This can also be checked at compile time for ranges whose maximum value is greater than 1, because an “indexed” access has to be done.
Several solutions can be envisaged to manage the 0 minimum value (ignore it as in Objective-C, always consider a set, etc.). In any case, the raw null value must be banned from ISA.
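As an illustration of the “always consider a set” option, here is a minimal sketch. The Multiplicity and Place classes and the UNBOUNDED constant are hypothetical names chosen for the example. Python checks the range at runtime only, whereas the text above also envisages compile-time checks.

```python
import math
from dataclasses import dataclass

UNBOUNDED = math.inf  # stands for "n" in a range such as (2,n)


@dataclass(frozen=True)
class Multiplicity:
    minimum: int
    maximum: float  # an int, or UNBOUNDED

    def __post_init__(self):
        if self.minimum < 0 or self.maximum < self.minimum:
            raise ValueError("invalid multiplicity range")

    def allows(self, count):
        return self.minimum <= count <= self.maximum


class Place:
    """Hypothetical container: always a collection, never a raw null."""

    def __init__(self, multiplicity, items=()):
        self.multiplicity = multiplicity
        self._items = list(items)
        self._check(len(self._items))

    def _check(self, count):
        if not self.multiplicity.allows(count):
            raise ValueError(f"{count} object(s) is outside the allowed range")

    def add(self, item):
        self._check(len(self._items) + 1)
        self._items.append(item)

    def remove(self, item):
        self._check(len(self._items) - 1)
        self._items.remove(item)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


# An optional Place (0,1): "absent" is simply the empty collection.
spouse = Place(Multiplicity(0, 1))
names = [person.upper() for person in spouse]  # no null check needed

# A mandatory Place (1,1) cannot be created empty.
try:
    Place(Multiplicity(1, 1))
    mandatory_rejected = False
except ValueError:
    mandatory_rejected = True
```

Iterating over an empty optional Place simply does nothing, which is the behavior that replaces the null check.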
Applicability
Multiplicity expresses the maximum extent of the set for a Property, Attribute or Argument. It is a way to express a class invariant.
Applicability, on the other hand, expresses the range of a Property, Attribute or Argument in a given state of the Object. It is a way to check state consistency at runtime. It is expressed or used in the pre-condition and post-condition of a Command.
Example: consider a Person class with a Gender Property (Male, Female) and a Pregnant Attribute (Boolean). The Applicability of the Pregnant Attribute expresses the condition that the Gender value is Female.
The Applicability type is Boolean.
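The Person example can be sketched like this. The names NotApplicableError, pregnant_is_applicable and set_pregnant are assumptions made for the illustration; the text only states that Applicability is a Boolean condition used in pre-conditions and post-conditions.

```python
from enum import Enum


class Gender(Enum):
    MALE = "Male"
    FEMALE = "Female"


class NotApplicableError(Exception):
    """Raised when an Attribute is used in a state where it does not apply."""


class Person:
    def __init__(self, gender):
        self.gender = gender
        self._pregnant = False

    def pregnant_is_applicable(self):
        # The Applicability is a Boolean computed from the Object's state.
        return self.gender is Gender.FEMALE

    @property
    def pregnant(self):
        if not self.pregnant_is_applicable():
            raise NotApplicableError("Pregnant does not apply to this Person")
        return self._pregnant

    def set_pregnant(self, value):
        # Pre-condition of the Command: the Attribute must be applicable.
        if not self.pregnant_is_applicable():
            raise NotApplicableError("Pregnant does not apply to this Person")
        self._pregnant = value


alice = Person(Gender.FEMALE)
alice.set_pregnant(True)

bob = Person(Gender.MALE)
try:
    bob.set_pregnant(True)
    rejected = False
except NotApplicableError:
    rejected = True
```

Here the inapplicable Attribute is neither null nor silently false: reading or writing it in the wrong state is reported as an inconsistency.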
To sum up, the Multiplicity and Applicability concepts have to be incorporated in ISA.
If there is no runtime library in each process, a significant amount of memory is freed: the runtime library is not duplicated in each process. Moreover, this forces memory management to be done in a uniform way, at the Operating System level.
In this kind of architecture, each Object acts as a micro-process. The O.S. allocates and manages all resources. It also dispatches the messages between Objects.
Perhaps it is too early to commit to not having a runtime library. Before doing that, I must make an in-depth study of Smalltalk and of the following operating systems: BeOS – Haiku, Choices, IBM AS/400, IBM TopView, Java OS, JX, MS Singularity, NeXTSTEP, OOSMOS, Syllable, Taligent.
Language level virtual machine
However, there is a particular kind of runtime library: the virtual machines (Java JVM, .NET CLR) that rely on non-native pseudocode and manage memory through a garbage collector. These approaches make portability and memory management easier for developers. But the trade-off is a major performance penalty (about one order of magnitude) and large resource consumption. This is incompatible with ISA wish N°6 – Modern, performant and durable.
There are alternatives, as the history of C/C++ shows. Portability can be addressed at the code level (see POSIX). Compilation could be done at design time, or at runtime on the target machine. Memory management can be done through an “on the fly” mechanism for allocation and release, and in a secure way by using the destructor and smart pointer pair (see modern C++). This approach is very efficient, linear in time (it does not stall) and minimizes memory usage.
Therefore, the language level virtual machine is a concept rejected from ISA.
One of the worst things found in real-world IT is the mixing of several languages in the same file.
I remember, years ago, having written a site with ColdFusion (CFM) with no fewer than 4 languages in the same file: HTML, JavaScript, CFM and SQL.
This kind of situation still exists more than 15 years later…
I'm not sure this practice can be called a “concept”…
But I'm sure that I don't want it for ISA.
A page, or even a window, is a kind of bag, either of characters or of pixels. It is a weakly structured thing.
Conceptually, the corresponding Class family is a “Display”, meaning something that is able to present something to our eyes. A Display belongs to one of several classes: Discrete indicator (LED), Character display, Graphical display, or a mix of these.
Depending on its class(es), a Display has properties like: the semantics of an indicator, number of characters, number of lines, resolution, color support, etc.
From an OO point of view, a Display Object displays Objects. Typically, these Objects are pushed onto the Display and live in a space with an orthonormal coordinate system.
The Objects, or Views of these, are able to be displayed and edited in a Display. An Object may also be related to other Objects, or even incorporated into them.
This is far richer than managing a graphic bag manually, specifically and repeatedly.
So, definitely, for our Ideal Software Architecture, the way an Object is rendered on a screen cannot be just a question of pixels, positions and colors.
This also raises questions about notions like Window or Page. More deeply, it questions the need for the Desktop metaphor and the relevance of the Web Browser.
Yes, yes, you read that correctly. I'm deprecating the Web!
The current Web, in fact: the one that came from Tim Berners-Lee's Information Management: A Proposal. My proposal is to return to Tim's original vision: not just connecting and linking documents, but Objects of any kind, and then rendering them, for instance on a Display object, or more generally on a Renderer, either visual or audio: one Renderer class per human sense.
It is common practice nowadays to serialize things using a hierarchical data format like XML or JSON. XML (like YAML) allows intra-document references, but this feature is rarely used and is limited to the document scope.
But Object Orientation, and more generally the real world, is not hierarchical at all. It is a graph where the objects interrelate in an open way, having internal and external references to other objects.
This hierarchical structure often comes from the famous DTO (Data Transfer Object). You know, the kind of object without any operations 🙁
In the C language, they are just called struct. This is closer to what they really are: “data barrels”.
In ISA, a more OO-related concept must be established: the View.
A View is, unsurprisingly, an Object giving a view of another Object: its root. It derives from it by exposing a selection of its members (properties, attributes, operations, relationships). The View concept is one of the cornerstones of ISA and will be widely discussed in upcoming posts.
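To make the idea concrete, here is a minimal sketch of a View that forwards only a chosen selection of members to its root. The View and Employee classes are hypothetical and only illustrate the principle; the actual ISA design is left to the later posts.

```python
class View:
    """Hypothetical View: exposes only a selection of its root's members."""

    def __init__(self, root, exposed):
        self._root = root
        self._exposed = frozenset(exposed)

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for members of the root.
        if name.startswith("_") or name not in self._exposed:
            raise AttributeError(f"{name!r} is not exposed by this view")
        return getattr(self._root, name)


class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

    def badge(self):
        return f"Employee: {self.name}"


alice = Employee("Alice", 50000)
public = View(alice, ["name", "badge"])

visible_name = public.name      # forwarded to the root
visible_badge = public.badge()  # operations are forwarded too
try:
    public.salary
    salary_hidden = False
except AttributeError:
    salary_hidden = True
```

Unlike a DTO, the View holds no copy of the data: it stays attached to its root, so a change in the root is immediately visible through the View.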
As mentioned above, a graph-oriented serialization requires being able to refer to another object, whether or not that object is in the same serialization unit. This requires a reliable identifier: the Object IDentifier (OID).
The serialization format must be both human-readable and small, to limit network overhead. These goals are quite contradictory. YAML is a good example of this kind of compromise. However, while it is less verbose than most other formats, it is repetitive on arrays and not as minimal as a separator-based syntax like CSV.
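A possible shape for such a serialization is sketched below. It is an assumption, not the ISA format: the Node class, the use of UUIDs as OIDs and the flat JSON list are chosen only to show that references by OID can carry cycles and point outside the serialization unit.

```python
import json
import uuid


class Node:
    def __init__(self, name, oid=None):
        self.oid = oid or str(uuid.uuid4())  # assumption: a UUID as OID
        self.name = name
        self.links = []          # references to Objects in the same unit
        self.external_oids = []  # OIDs of Objects stored elsewhere


def serialize(nodes):
    """One flat record per Object; relationships are written as OIDs."""
    unit = []
    for node in nodes:
        targets = [target.oid for target in node.links] + node.external_oids
        unit.append({"oid": node.oid, "name": node.name, "links": targets})
    return json.dumps(unit)


def deserialize(text):
    """Rebuild the Objects first, then resolve the OIDs into references."""
    records = json.loads(text)
    by_oid = {r["oid"]: Node(r["name"], r["oid"]) for r in records}
    for record in records:
        node = by_oid[record["oid"]]
        for oid in record["links"]:
            if oid in by_oid:
                node.links.append(by_oid[oid])
            else:
                node.external_oids.append(oid)  # left unresolved
    return by_oid


a, b = Node("a"), Node("b")
a.links.append(b)
b.links.append(a)  # a cycle, impossible in a purely hierarchical format
b.external_oids.append("oid-of-an-object-stored-elsewhere")

restored = deserialize(serialize([a, b]))
```

The two-pass reconstruction (create every Object, then resolve the links) is what lets the cycle between a and b survive the round trip.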
To conclude, the serialization format has to be studied carefully. Perhaps having 2 formats is the best answer, perhaps using compression to decrease the stream size (rgcu_2_3_4_0.csv: 9 937 bytes, rgcu_2_3_4_0.xml: 36 150 bytes, rgcu_2_3_4_0.csv.zip: 854 bytes, rgcu_2_3_4_0.xml.zip: 1 988 bytes)…
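This kind of measurement is easy to reproduce. The sketch below uses synthetic records invented for the illustration, not the rgcu files quoted above, and zlib's DEFLATE in place of a zip archive, so its numbers will differ from those figures.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zlib

# Synthetic records, invented for the illustration only.
rows = [
    {"id": str(i), "name": f"item{i}", "price": str(i * 3 % 100)}
    for i in range(200)
]


def to_csv(records):
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "name", "price"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue().encode("utf-8")


def to_xml(records):
    root = ET.Element("items")
    for record in records:
        item = ET.SubElement(root, "item")
        for key, value in record.items():
            ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="utf-8")


sizes = {}
for label, payload in (("csv", to_csv(rows)), ("xml", to_xml(rows))):
    sizes[label] = len(payload)
    sizes[label + " (deflate)"] = len(zlib.compress(payload, 9))

for label, size in sizes.items():
    print(f"{label}: {size} bytes")
```

On repetitive data like this, compression narrows the gap between the verbose and the compact format, which is the argument for keeping a readable format and compressing it on the wire.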