U+1F929 GRINNING FACE WITH STAR EYES from Google’s Noto Emoji font, via ImageMagick and xbmbraille, as rendered by the kitty terminal.
XBM❤braille
There’s not always a good reason to play with stuff, but stuff sometimes begs to be played with. In this case, Unicode’s braille block gives you a (chonky) monochrome pixel-addressable screen on the terminal. And XBM gives you a very straightforward monochrome bitmap representation, so you can play with both things with minimal friction.
Braille in Unicode
There are Braille patterns in Unicode, from ⠀ (U+2800 BRAILLE PATTERN BLANK) through ⣿ (U+28FF BRAILLE PATTERN DOTS-12345678). Outside of -BLANK, each character is named according to which dots are present, and the dots are numbered from 1 to 8, in the obvious way:
14
25
36
78
The name is always sorted (so it’s DOTS-167, not DOTS-176, even though 7 is in the first column).
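The numbering also maps straight onto the code points: as far as I can tell from the layout of the block, dot n is bit n-1 of the offset from U+2800. Here is a minimal Python sketch that checks this against the character names; the helper name braille_from_dots is made up for illustration.

```python
from unicodedata import lookup

BRAILLE_BASE = 0x2800  # U+2800 BRAILLE PATTERN BLANK


def braille_from_dots(dots):
    """Return the braille character with the given dot numbers (1-8) raised."""
    offset = 0
    for n in dots:
        offset |= 1 << (n - 1)  # assumption being checked: dot n is bit n-1
    return chr(BRAILLE_BASE + offset)


# The name lists the dots in sorted order, whatever order we pass them in.
print(braille_from_dots([7, 1, 6]) == lookup("BRAILLE PATTERN DOTS-167"))
```

If that prints True, the arithmetic and the names agree for that character, which is a handy cross-check for anything built from names later on.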
These beg to be used as the building blocks for a canvas, or at least a pixel-addressable screen. All you have to do is get a monochrome bitmap, and map the pixels onto the right characters.
XBM
You probably have some XBM files on your Linux desktop.
They’re text.
They look suspiciously like C code.
If you’re writing C you probably #include them directly, embedding them like a boss.
: locate .xbm | tail -n1
/usr/share/themes/Syscrash/openbox-3/max_toggled.xbm
: !! | xargs cat
locate .xbm | tail -n1 | xargs cat
#define max_toggled_width 6
#define max_toggled_height 6
static unsigned char max_toggled_bits[] = {
0x3c, 0x27, 0x25, 0x3d, 0x11, 0x1f };
There are some slight variations (e.g. some versions don’t say unsigned), and some of them are designed for use as pointers so they have a hotspot, but you get the idea.
If you’ve ever looked into XPM files, those are even neater to hack around with because in those you can see the image directly (if you squint a little), but that would be wasteful for our use case (XPMs are 8-bit paletted), so we’re sticking with XBMs.
From the Wikipedia article on XBM:
XBM image data consists of a line of pixel values stored in a static array. Because a single bit represents each pixel (0 for white or 1 for black), each byte in the array contains the information for eight pixels, with the upper left pixel in the bitmap represented by the low bit of the first byte in the array. If the image width does not match a multiple of 8, the extra bits in the last byte of each row are ignored.
That “0 for white or 1 for black” is backwards to what you might expect, but in practice it’ll depend on whether your terminal is dark-on-light or vice versa.
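To make the bit order concrete, here is a small Python sketch that reads a single pixel out of XBM data, following the description quoted above (rows padded to a whole byte, low bit first). The function name xbm_pixel is mine, and the data is the max_toggled bitmap shown earlier.

```python
def xbm_pixel(bits, width, x, y):
    """Return the raw bit (0 or 1) stored for pixel (x, y)."""
    stride = (width + 7) // 8        # bytes per row, padded to a whole byte
    byte = bits[y * stride + x // 8]
    return (byte >> (x % 8)) & 1     # the low bit is the leftmost pixel


# The 6x6 max_toggled bitmap from the listing earlier on.
max_toggled = [0x3C, 0x27, 0x25, 0x3D, 0x11, 0x1F]

# Draw set bits as '#' and clear bits as '.', one row per line.
for y in range(6):
    print("".join("#" if xbm_pixel(max_toggled, 6, x, y) else "." for x in range(6)))
```

Whether '#' ends up meaning ink or background is the same dark-on-light question as above; the sketch only shows which bit belongs to which pixel.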
Pixel streams
So let’s assume we have a stream of bytes that correspond to the bitmapped pixels, horizontally. We’ll need to look at four lines at a time, and for each quadruple of interleaved bytes we consume we should be producing four Unicode characters (assuming the image’s width is a multiple of 8). The first (lowest) two bits of each byte in the quadruple determine the first Unicode character, the second two bits the second character, and so on. We’re obviously going to need a little table to look this up, but if the four bytes coming in are a, b, c, and d, then the table lookup will be something like
table[(a&0b00000011)<<0+(b&0b00000011)<<2+(c&0b00000011)<<4+(d&0b00000011)<<6],
table[(a&0b00001100)>>2+(b&0b00001100)<<0+(c&0b00001100)<<2+(d&0b00001100)<<4],
table[(a&0b00110000)>>4+(b&0b00110000)>>2+(c&0b00110000)<<0+(d&0b00110000)<<2],
table[(a&0b11000000)>>6+(b&0b11000000)>>4+(c&0b11000000)>>2+(d&0b11000000)<<0]
I particularly appreciate how the ability to write binary literals makes it easier to understand what’s going on.
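Those four expressions rely on Go's precedence rules, where shifts bind tighter than +. In Python (and C) it is the other way around, so a literal transcription would silently compute something else. Here is a hedged Python sketch of the same packing, with | and explicit parentheses; quad_indices is a name I made up.

```python
def quad_indices(a, b, c, d):
    """Map one byte from each of four consecutive rows to four table indices.

    Each index packs two bits from a into bits 0-1, two from b into bits 2-3,
    two from c into bits 4-5 and two from d into bits 6-7.
    """
    indices = []
    for shift in (0, 2, 4, 6):
        indices.append(
            ((a >> shift) & 0b11)
            | (((b >> shift) & 0b11) << 2)
            | (((c >> shift) & 0b11) << 4)
            | (((d >> shift) & 0b11) << 6)
        )
    return indices


print(quad_indices(0b00000001, 0, 0, 0))
```

Shifting first and masking afterwards means one loop covers all four characters, at the cost of looking a little less like the picture the binary literals paint.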
Building the table is straightforward if we can look characters up by name. That’s easily done in Python (see go#32937 about doing this in Go):
from unicodedata import lookup

table = []
# 255 (no dots at all) has no DOTS- name, so it is handled after the loop
for i in range(255):
    # REMEMBER 0 is 'on'
    dots = []
    if (i & 0b00000001) == 0:
        dots.append('1')
    if (i & 0b00000010) == 0:
        dots.append('4')
    if (i & 0b00000100) == 0:
        dots.append('2')
    if (i & 0b00001000) == 0:
        dots.append('5')
    if (i & 0b00010000) == 0:
        dots.append('3')
    if (i & 0b00100000) == 0:
        dots.append('6')
    if (i & 0b01000000) == 0:
        dots.append('7')
    if (i & 0b10000000) == 0:
        dots.append('8')
    dots.sort()
    table.append(lookup("BRAILLE PATTERN DOTS-" + "".join(dots)))
table.append(lookup("BRAILLE PATTERN BLANK"))
and you can print the table out, align it properly and then visually inspect it for correctness,
'⣿', '⣾', '⣷', '⣶', '⣽', '⣼', '⣵', '⣴', '⣯', '⣮', '⣧', '⣦', '⣭', '⣬', '⣥', '⣤',
'⣻', '⣺', '⣳', '⣲', '⣹', '⣸', '⣱', '⣰', '⣫', '⣪', '⣣', '⣢', '⣩', '⣨', '⣡', '⣠',
'⣟', '⣞', '⣗', '⣖', '⣝', '⣜', '⣕', '⣔', '⣏', '⣎', '⣇', '⣆', '⣍', '⣌', '⣅', '⣄',
'⣛', '⣚', '⣓', '⣒', '⣙', '⣘', '⣑', '⣐', '⣋', '⣊', '⣃', '⣂', '⣉', '⣈', '⣁', '⣀',
'⢿', '⢾', '⢷', '⢶', '⢽', '⢼', '⢵', '⢴', '⢯', '⢮', '⢧', '⢦', '⢭', '⢬', '⢥', '⢤',
'⢻', '⢺', '⢳', '⢲', '⢹', '⢸', '⢱', '⢰', '⢫', '⢪', '⢣', '⢢', '⢩', '⢨', '⢡', '⢠',
'⢟', '⢞', '⢗', '⢖', '⢝', '⢜', '⢕', '⢔', '⢏', '⢎', '⢇', '⢆', '⢍', '⢌', '⢅', '⢄',
'⢛', '⢚', '⢓', '⢒', '⢙', '⢘', '⢑', '⢐', '⢋', '⢊', '⢃', '⢂', '⢉', '⢈', '⢁', '⢀',
'⡿', '⡾', '⡷', '⡶', '⡽', '⡼', '⡵', '⡴', '⡯', '⡮', '⡧', '⡦', '⡭', '⡬', '⡥', '⡤',
'⡻', '⡺', '⡳', '⡲', '⡹', '⡸', '⡱', '⡰', '⡫', '⡪', '⡣', '⡢', '⡩', '⡨', '⡡', '⡠',
'⡟', '⡞', '⡗', '⡖', '⡝', '⡜', '⡕', '⡔', '⡏', '⡎', '⡇', '⡆', '⡍', '⡌', '⡅', '⡄',
'⡛', '⡚', '⡓', '⡒', '⡙', '⡘', '⡑', '⡐', '⡋', '⡊', '⡃', '⡂', '⡉', '⡈', '⡁', '⡀',
'⠿', '⠾', '⠷', '⠶', '⠽', '⠼', '⠵', '⠴', '⠯', '⠮', '⠧', '⠦', '⠭', '⠬', '⠥', '⠤',
'⠻', '⠺', '⠳', '⠲', '⠹', '⠸', '⠱', '⠰', '⠫', '⠪', '⠣', '⠢', '⠩', '⠨', '⠡', '⠠',
'⠟', '⠞', '⠗', '⠖', '⠝', '⠜', '⠕', '⠔', '⠏', '⠎', '⠇', '⠆', '⠍', '⠌', '⠅', '⠄',
'⠛', '⠚', '⠓', '⠒', '⠙', '⠘', '⠑', '⠐', '⠋', '⠊', '⠃', '⠂', '⠉', '⠈', '⠁', '⠀'
looking good so far.
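If squinting at 256 glyphs isn't your thing, the table can also be cross-checked with arithmetic instead of names. This sketch assumes that dot n is bit n-1 of the offset from U+2800, and that the index bits map to dots 1, 4, 2, 5, 3, 6, 7, 8 with 0 meaning 'on', as in the loop above. DOT_FOR_BIT and build_table are names I made up.

```python
from unicodedata import name

# Bit k of the table index controls this dot number (a 0 bit means the dot is on).
DOT_FOR_BIT = (1, 4, 2, 5, 3, 6, 7, 8)


def build_table():
    """Build the 256-entry lookup table by code point arithmetic."""
    table = []
    for i in range(256):
        offset = 0
        for bit, dot in enumerate(DOT_FOR_BIT):
            if not i & (1 << bit):
                offset |= 1 << (dot - 1)  # assumed: dot n is bit n-1
        table.append(chr(0x2800 + offset))
    return table


table = build_table()
# Spot-check both ends of the table by name.
print(name(table[0]))
print(name(table[255]))
```

Comparing this list against the one built by name is a quick way to catch a swapped pair of dots in either version.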
In Go, this could be a []rune; if we store it as a string constant we wouldn’t be able to index it directly because each rune would be stored as three bytes, but we could do a bit of maths and save a bit of space. We’ll have to measure that at some point. Might as well do it now. Compare
const brailidx = "<... a string made by subtracting 10240 from each of those runes above ...>"

func bit2brail(b uint8) rune {
	// indexing a string yields a byte; add the block's base (U+2800) back
	return 10240 + rune(brailidx[b])
}
with
var brailidx = []rune{ /* ... those runes ... */ }

func bit2brail(b uint8) rune {
	return brailidx[b]
}
the first one is ~40% faster:
BenchmarkStringAndMath-12 4848230 246.2 ns/op
BenchmarkRuneSlice-12 3539121 347.3 ns/op
Looking at the assembly output (go build -gcflags -S .) you can notice some wasted cycles checking bounds of the rune slice, which can be avoided by changing it to an array,
var brailidx = [...]rune{ /* ... those runes ... */ }
which gets them within 20%:
BenchmarkStringAndMath-12 4914315 239.4 ns/op
BenchmarkRuneSlice-12 4165580 289.2 ns/op
I like that in the string version the data is immutable, but I don’t like that it’s a lot less readable. Can we get the best of both worlds?
const brailidx = `
⣿ ⣾ ⣷ ⣶ ⣽ ⣼ ⣵ ⣴ ⣯ ⣮ ⣧ ⣦ ⣭ ⣬ ⣥ ⣤ ⣻ ⣺ ⣳ ⣲ ⣹ ⣸ ⣱ ⣰ ⣫ ⣪ ⣣ ⣢ ⣩ ⣨ ⣡ ⣠
... etc ...`

func bit2brail(b uint8) rune {
	return []rune(brailidx[4*int(b)+1 : 4*(int(b)+1)])[0]
}
this does unfortunately pay the price of that []rune conversion, and those pesky bounds checks are back,
BenchmarkRuneConst-12 546738 2148 ns/op
and it’s actually faster to do it explicitly,
func bit2brail(b uint8) rune {
	// each entry is one separator byte followed by a three-byte rune
	n := 4*int(b) + 1
	if n < 0 || n >= len(brailidx) {
		return '⠀'
	}
	r, _ := utf8.DecodeRuneInString(brailidx[n:])
	return r
}
but even then it’s the worst option:
BenchmarkStringAndMath-12 4997421 245.6 ns/op
BenchmarkRuneSlice-12 4116244 290.7 ns/op
BenchmarkRuneConst-12 1530085 791.0 ns/op
Regular expressions
Now we just need a way to load an XBM file into something useful. The quickest way for me is to write what we know into a regular expression.
We’ll probably want to come back and revisit it once everything is working and make an actual parser that normal people can understand, but for now just
(?sm)^#define\s+\S+_width\s+(\d+)\s*$
#define\s+\S+_height\s+(\d+)\s*$
(?:#define\s+\S+_x_hot\s+\d+$
#define\s+\S+_y_hot\s+\d+$
)?static\s+(?:unsigned\s+)?char\s+(\S+)_bits\s*\[\]\s*=\s*{\s*$
((?:\s*0x[[:xdigit:]]{2}\s*,)*\s*0x[[:xdigit:]]{1,2})\s*,?\s*\}
should work to load the whole file, and a simple 0x[[:xdigit:]]{1,2} to then pull out the hex digits. This will fail for older XBMs (“X10” format) where the data is in 16-bit numbers instead of the 8-bit ones shown here, but I haven’t found any of these “in the wild” outside of explicitly asking GIMP to create one, so I’m not worried at this point.
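For comparison, here is a rough Python take on the same idea. It is a sketch, not a port. Python's re module has no [[:xdigit:]], so it spells out the hex digits, and it is a looser pattern than the one above in two ways: it grabs everything between the braces in one go, and it uses \S* rather than \S+ so that a file with an empty name prefix (like #define _width 4) also parses. XBM_RE and parse_xbm are made-up names.

```python
import re

XBM_RE = re.compile(
    r"#define\s+\S*_width\s+(\d+)\s+"
    r"#define\s+\S*_height\s+(\d+)\s+"
    # optional hotspot, as found in cursor bitmaps
    r"(?:#define\s+\S*_x_hot\s+\d+\s+#define\s+\S*_y_hot\s+\d+\s+)?"
    r"static\s+(?:unsigned\s+)?char\s+\S*_bits\s*\[\]\s*=\s*\{([^}]*)\}"
)


def parse_xbm(text):
    """Return (width, height, list of byte values) for an 8-bit XBM."""
    m = XBM_RE.search(text)
    if m is None:
        raise ValueError("not an XBM this sketch understands")
    width, height = int(m.group(1)), int(m.group(2))
    bits = [int(h, 16) for h in re.findall(r"0x[0-9a-fA-F]{1,2}", m.group(3))]
    if len(bits) != ((width + 7) // 8) * height:
        raise ValueError("unexpected amount of pixel data")
    return width, height, bits


sample = """#define max_toggled_width 6
#define max_toggled_height 6
static unsigned char max_toggled_bits[] = {
   0x3c, 0x27, 0x25, 0x3d, 0x11, 0x1f };
"""
print(parse_xbm(sample))
```

The length check at the end is cheap insurance: if the header and the data disagree, it's better to find out here than halfway through rendering.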
Future work
Using ImageMagick it’s easy to procedurally create XBMs with individual pixels turned on:
: convert -size 4x4 xc:black -fill white -draw "point 1,2" XBM:
#define _width 4
#define _height 4
static char _bits[] = {
0x0F, 0x0F, 0x0D, 0x0F, };
and so you can create a little mapping to test that each bit in an XBM is picked up properly, and another to test that it is output as the right braille character. This’ll let you dig into the weird and wonderful corner cases, e.g. what happens with a 3×3 or a 7×7 XBM.
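As a starting point for that kind of test, here is a small Python renderer that goes pixel by pixel instead of using the lookup table, so it copes with widths and heights that aren't multiples of 8 or 4. It assumes, as above, that a 0 bit lights a dot and that dot n is bit n-1 of the offset from U+2800. The name render is mine, and this is a sketch rather than what xbmbraille itself does.

```python
def render(width, height, bits):
    """Render XBM pixel data as lines of braille; a 0 bit lights a dot."""
    stride = (width + 7) // 8  # bytes per row

    def lit(x, y):
        # Pixels outside the image (padding bits, missing rows) stay dark.
        if x >= width or y >= height:
            return False
        return not (bits[y * stride + x // 8] >> (x % 8)) & 1

    # Bit of the code point offset for each cell position, indexed [row][column]:
    # dots 1,4 / 2,5 / 3,6 / 7,8.
    dot_bit = ((0, 3), (1, 4), (2, 5), (6, 7))
    lines = []
    for top in range(0, height, 4):
        line = ""
        for left in range(0, width, 2):
            offset = 0
            for dy in range(4):
                for dx in range(2):
                    if lit(left + dx, top + dy):
                        offset |= 1 << dot_bit[dy][dx]
            line += chr(0x2800 + offset)
        lines.append(line)
    return lines


# The 4x4 ImageMagick example above: only pixel (1, 2) is lit.
print(render(4, 4, [0x0F, 0x0F, 0x0D, 0x0F]))
```

The padding bits in that example are all 0, which would light up half the row if they weren't masked out, so it makes a nice first corner case.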
Also from here you can look into fuzz testing the XBM reader, prior to moving away from the regexp.
Or, I dunno, just have fun.
: go install chipaca.com/xbmbraille@latest
: convert +dither -font Noto-Emoji -pointsize 64 label:🤩 -trim XBM:- | xbmbraille -n -
⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣀⣤⣴⣶⠶⠶⠶⠶⢶⣶⣤⣤⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⢀⣴⡾⠟⠋⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠛⠿⣶⣄⡀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⣠⣾⠟⣿⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣴⡟⢿⣦⡀⠀⠀⠀⠀
⠀⠀⢀⣼⠟⠁⢠⣿⣿⣦⢀⣀⣀⣀⠀⠀⠀⠀⢀⣀⣀⣀⢀⣾⣿⣿⠀⠙⢿⣆⠀⠀⠀
⠀⢀⣾⣯⣤⣴⣾⣿⣿⣿⣿⣿⣿⠟⠀⠀⠀⠀⠈⢿⣿⣿⣿⣿⣿⣿⣶⣦⣤⣿⣧⠀⠀
⠀⣾⠏⠙⠻⣿⣿⣿⣿⣿⣿⣿⠁⠀⠀⠀⠀⠀⠀⠀⢹⣿⣿⣿⣿⣿⣿⡿⠟⠉⢻⣇⠀
⢸⡟⠀⠀⠀⠀⣿⣿⣿⣿⣿⣿⣇⠀⠀⠀⠀⠀⠀⢀⣾⣿⣿⣿⣿⣿⡏⠀⠀⠀⠈⣿⡀
⣿⡇⠀⠀⠀⠀⣿⠟⠋⠀⠈⠙⠛⠀⠀⠀⠀⠀⠀⠘⠛⠉⠁⠀⠙⢿⡇⠀⠀⠀⠀⢿⡇
⣿⡇⠀⠀⠀⠀⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣸⡇
⢻⡇⠀⠀⠀⢀⣤⣤⣀⣀⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣀⣀⣠⣤⣄⠀⠀⠀⠀⣿⡇
⠸⣿⠀⠀⠀⠘⣧⣄⣈⣉⡉⠉⠛⠛⠛⠛⠛⠛⠛⠛⠋⠉⢉⣉⣀⣠⡿⠀⠀⠀⢰⡿⠀
⠀⢹⣧⠀⠀⠀⠙⢿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡿⠁⠀⠀⢠⣿⠃⠀
⠀⠀⠹⣷⡀⠀⠀⠀⠙⠿⣿⣿⡿⠟⠛⠋⠉⠛⠛⠿⣿⣿⡿⠟⠉⠀⠀⠀⣠⡿⠃⠀⠀
⠀⠀⠀⠘⢿⣦⡀⠀⠀⠀⠀⠉⠛⠓⠶⠶⠶⠶⠖⠛⠋⠁⠀⠀⠀⠀⢀⣼⠟⠁⠀⠀⠀
⠀⠀⠀⠀⠀⠙⠿⣦⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣠⣾⠟⠁⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠈⠙⠻⢶⣦⣤⣄⣀⣀⣀⣀⣀⣠⣤⣶⠾⠟⠋⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠉⠉⠛⠛⠛⠉⠉⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀