I have always been interested in breaking down data and concepts into fundamental pieces. In a way everything can be decomposed into bits, at least in the realm of data.
And so it has always seemed unnatural to me that programming languages and libraries do not directly express the fact that the bit is the unit of information. This is particularly awkward when dealing with data that is not neatly packed into bytes. A truly compositional way of parsing should build upon bits, even if 8 bits or more at a time are read from disk, memory or wherever.
So I set out to formulate this in C#. This is only the beginning, but here’s what the very first alpha version of the code looks like.
First we define decomposition for bool values (bits). This is obviously the simplest case, as a bit cannot be decomposed any further (well, we could, but let’s not go there):
public static IEnumerable<bool> Information(this bool x)
{
    yield return x;
}
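Next we decompose bytes:
public static IEnumerable<bool> Information(this byte x)
{
    // (x >> i) & 1 extracts bit i, starting from the least significant bit;
    // Normalize (defined below) then applies the ordering convention.
    return Enumerable.Range(0, 8)
                     .Select(i => ((x >> i) & 1) == 1)
                     .Normalize();
}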
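To make the shift-and-mask step concrete, here is a small self-contained sketch that prints the raw bits of one byte before any normalization is applied. The class name `ShiftMaskDemo` and the helper `RawBits` are made up for this illustration; they are not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShiftMaskDemo
{
    // Hypothetical helper: the same shift-and-mask as above, without Normalize.
    // Bit i of x is (x >> i) & 1, so i = 0 is the least significant bit.
    public static IEnumerable<bool> RawBits(byte x)
    {
        return Enumerable.Range(0, 8)
                         .Select(i => ((x >> i) & 1) == 1);
    }

    public static void Main()
    {
        byte x = 5; // binary 00000101
        string bits = string.Concat(RawBits(x).Select(b => b ? '1' : '0'));
        Console.WriteLine(bits); // prints 10100000 (least significant bit first)
    }
}
```

The shift itself gives the same result on every machine, so the raw sequence always starts with the least significant bit.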
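Then we define a normalization combinator that orders the bits according to the current architecture’s endianness. On a little-endian machine the sequence is reversed, so a byte comes out most significant bit first. (Strictly speaking, endianness describes the order of bytes within a multi-byte value, not of bits within a byte, so this is a convention I am choosing rather than something the hardware dictates.)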
public static IEnumerable<bool> Normalize(this IEnumerable<bool> s)
{
    return BitConverter.IsLittleEndian ? s.Reverse() : s;
}
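As a quick check of what the combinator does to a byte, the following sketch wraps the two functions from this post in a static class and prints the resulting bit string. The class names `BitDecomposition` and `NormalizeDemo` and the display helper `ToBitString` are assumptions of mine, added only so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BitDecomposition
{
    // Copied from the post.
    public static IEnumerable<bool> Normalize(this IEnumerable<bool> s)
    {
        return BitConverter.IsLittleEndian ? s.Reverse() : s;
    }

    // Copied from the post.
    public static IEnumerable<bool> Information(this byte x)
    {
        return Enumerable.Range(0, 8)
                         .Select(i => ((x >> i) & 1) == 1)
                         .Normalize();
    }

    // Hypothetical display helper: renders a bit sequence as '0'/'1' characters.
    public static string ToBitString(this IEnumerable<bool> bits)
    {
        return string.Concat(bits.Select(b => b ? '1' : '0'));
    }
}

public static class NormalizeDemo
{
    public static void Main()
    {
        byte x = 5; // binary 00000101
        // On a little-endian machine the sequence is reversed: 00000101.
        // On a big-endian machine it is left as produced: 10100000.
        Console.WriteLine(x.Information().ToBitString());
    }
}
```

Note that the printed order depends on `BitConverter.IsLittleEndian`, so the same byte yields a different sequence on a big-endian machine.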
And so on and so forth. As you can see, this is very limited so far, but you can imagine combining these functions to decompose byte streams from files and the like. This will allow you to be completely oblivious to endianness and to how bytes are decomposed. All you see are bitstreams.
As I said, this is alpha. I have almost no experience with binary parsing and such, but this should be an interesting way of diving into that area.
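One way the combination could look is sketched below: a byte sequence is decomposed by concatenating the bits of each byte, and fields that do not line up with byte boundaries are then read straight off the bitstream. The overload `Information(this IEnumerable<byte>)`, the helper `ToUInt32` and the class names are hypothetical additions of mine, not part of the alpha code, and the field layout (3 bits, 5 bits, 8 bits) is invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BitStreams
{
    // Copied from the post.
    public static IEnumerable<bool> Normalize(this IEnumerable<bool> s)
    {
        return BitConverter.IsLittleEndian ? s.Reverse() : s;
    }

    // Copied from the post.
    public static IEnumerable<bool> Information(this byte x)
    {
        return Enumerable.Range(0, 8)
                         .Select(i => ((x >> i) & 1) == 1)
                         .Normalize();
    }

    // Hypothetical: decompose a byte sequence by concatenating each byte's bits.
    public static IEnumerable<bool> Information(this IEnumerable<byte> bytes)
    {
        return bytes.SelectMany(b => b.Information());
    }

    // Hypothetical: fold bits into an unsigned integer, first bit most significant.
    public static uint ToUInt32(this IEnumerable<bool> bits)
    {
        return bits.Aggregate(0u, (acc, bit) => (acc << 1) | (bit ? 1u : 0u));
    }
}

public static class BitStreamDemo
{
    public static void Main()
    {
        // 0xA5 0x3C is 10100101 00111100 when written most significant bit first.
        List<bool> bits = new byte[] { 0xA5, 0x3C }.Information().ToList();

        uint a = bits.Take(3).ToUInt32();         // first 3 bits
        uint b = bits.Skip(3).Take(5).ToUInt32(); // next 5 bits
        uint c = bits.Skip(8).ToUInt32();         // remaining 8 bits

        // On a little-endian machine this prints: 5 5 60
        Console.WriteLine($"{a} {b} {c}");
    }
}
```

The same overload should work on the result of `File.ReadAllBytes`, although that loads the whole file into memory first; a lazier source would be a natural next step.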
