“Recreation” on binary versus textual representation

In a recent brainstorming post, I stated that in ISA, having a type system everywhere, and using the right type in the right place, is certainly a very good policy.

In the current IT world this is obviously not the case. Among other things, we often need to encode binary data as text, which gives rise to several variants such as hexadecimal, Base64, etc.
In ISA, there are several types to represent text: Key, String, Text, etc. (cf. Textual Assembly). For binary data, there are mainly the types sbyte and Sbytes (cf. Elements Assembly).

So software (for editing, display, etc.) always knows the type and can show the appropriate editor or renderer. Displaying text is a well-understood problem (although the Text type requires some deep changes, because it aims to supersede HTML).

For binary data, however, editing tools are much poorer. That is because they almost always try to treat binary data as text, which it is not!

This is why I will try to suggest a symbolic system that represents binary data efficiently, for what it is.

The principle is simple: represent each of the 256 possible values of an octet as a human-readable symbol. These symbols are separated by a blank border, as characters are.

The proposal is to have 8 zones in the symbol, one for each bit. The topmost, leftmost zone is for the 8th bit (most significant), and the bottommost, rightmost zone is for the 1st bit (least significant). The reading convention is left to right, then top to bottom.
On a 6 by 12 matrix of points, these 8 zones fit exactly, each of them covering 9 points.
The bits are represented as follows:
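
As a text-only sketch of this layout, the Python below expands one octet into a 6 by 12 matrix of points. It rests on assumptions that the post does not spell out: each zone is taken to be a 3 by 3 square, the zones are arranged in 2 columns and 4 rows, and a set bit is drawn as a filled zone (`#`) while a clear bit is left empty (`.`). The function names are hypothetical.

```python
ZONE = 3            # assumption: each zone is a 3 x 3 square of points
COLS, ROWS = 2, 4   # assumption: 2 zones across, 4 zones down -> 6 x 12 points


def byte_to_zones(value: int) -> list[list[int]]:
    """Return a 4 x 2 grid of bits: bit 8 at top-left, bit 1 at bottom-right."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one octet")
    # Most significant bit first, matching the left-to-right, top-to-bottom order.
    bits = [(value >> shift) & 1 for shift in range(7, -1, -1)]
    return [bits[r * COLS:(r + 1) * COLS] for r in range(ROWS)]


def byte_to_matrix(value: int, on: str = "#", off: str = ".") -> list[str]:
    """Expand each zone to 3 x 3 points, giving 12 rows of 6 characters."""
    rows = []
    for zone_row in byte_to_zones(value):
        line = "".join((on if bit else off) * ZONE for bit in zone_row)
        rows.extend([line] * ZONE)
    return rows


if __name__ == "__main__":
    # Example: the octet 10100101
    print("\n".join(byte_to_matrix(0b10100101)))
```

The actual drawing of each zone (filled, hollow, dotted) may differ from this sketch; only the bit-to-zone mapping follows the reading convention given above.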

The two extreme octets, 11111111 and 00000000, have the following symbols:


Some examples are shown below:

A full example of binary data as it could appear in a binary editor, with 8 bytes (1 sbyte) per line:
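
A rough approximation of such an editor view can be produced in plain text. The sketch below is only an illustration under assumptions: it uses one character per zone (so each symbol is 2 characters wide and 4 text rows tall), draws a set bit as `#` and a clear bit as `.`, and prints the hexadecimal form of the same bytes under each line for comparison. The names `zone_rows` and `dump` are hypothetical.

```python
def zone_rows(value: int) -> list[str]:
    """Four 2-character rows for one octet, one character per zone."""
    bits = format(value, "08b")  # most significant bit first
    return [
        bits[i:i + 2].replace("1", "#").replace("0", ".")
        for i in range(0, 8, 2)
    ]


def dump(data: bytes, per_line: int = 8) -> str:
    """Render data as rows of symbols, per_line octets per row."""
    out = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        glyphs = [zone_rows(b) for b in chunk]
        # Each line of data takes 4 text rows; symbols are separated by a blank.
        for r in range(4):
            out.append(" ".join(g[r] for g in glyphs))
        out.append(chunk.hex(" "))  # the same bytes in hexadecimal
        out.append("")
    return "\n".join(out)


if __name__ == "__main__":
    print(dump(bytes(range(0, 256, 17))))
```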

Pretty readable, isn't it? Especially when compared with the same data expressed in hexadecimal characters…

You may notice that this representation recalls the punch cards used in the past. Although this was not anticipated, it is certainly not a coincidence.

One important point is that these domino-style symbols are not characters. Consequently, from a computer science point of view, it makes no sense to have corresponding glyphs in the Unicode character set, for instance.

This representation also has the property of being both machine-readable and human-readable. In other words, it can replace the current systems based on bar codes and QR codes. It is also considerably smaller, which is another advantage.
A single line of 8 symbols can store 2^64 ≈ 1.84467E+19 values (enough for any 19-digit decimal number). This is the capacity needed for an object ID, for instance. When needed, just 3 lines can completely identify any object by its class ID, object ID and version number.
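
The arithmetic can be checked with a few lines of Python. The sketch below also packs three identifiers into 3 lines of 8 octets each; the field widths (one unsigned 64-bit integer per line) and the big-endian byte order are assumptions made for illustration, as are the function names.

```python
import struct

# 8 symbols x 8 bits = 64 bits per line
VALUES_PER_LINE = 2 ** 64

# Assumed layout: three unsigned 64-bit integers, big-endian, one per line.
LAYOUT = ">QQQ"


def pack_identity(class_id: int, object_id: int, version: int) -> bytes:
    """Return the 24 octets (3 lines of 8 symbols) identifying an object."""
    return struct.pack(LAYOUT, class_id, object_id, version)


def unpack_identity(data: bytes) -> tuple[int, int, int]:
    """Inverse of pack_identity."""
    return struct.unpack(LAYOUT, data)


if __name__ == "__main__":
    print(f"{VALUES_PER_LINE:.5E}")          # capacity of one line
    print(10 ** 19 - 1 < VALUES_PER_LINE)    # every 19-digit number fits
    raw = pack_identity(42, 123456789, 3)
    print(len(raw), unpack_identity(raw))
```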

With some hardware development, it could also replace passive RFID chips, being readable both visually and through an electromagnetic field. This kind of RFID tag could be printed with small holes that also alter the underlying electric matrix: simple, cheap and secure!

Other applications that I have not thought of could certainly be found (the two above were not foreseen when I started this post).

To conclude this short recreation, this proposal for binary data representation is of course open to discussion. Your comments are welcome!

Tags: (05 - Warm up), Uncategorized

ISA

Ideal Software Architecture Blog
