Of course none of this helps those using screen-readers and other tech, so make sure that all your fancy colouring & such is additive so if it is all “lost” no meaning is absolutely lost with it.
--------
[1] Some people can be very vocal about this, more so than if highlighting isn't possible at all. If you give any output formatting they'll expect you to match, or be able to be made to match, their preferred style.
They've been absolutely invaluable for making sure their kind of people can use my apps properly.
It is not a fun condition to have, and leads to lots of problems in my everyday life. This blog post accidentally accentuated that issue, since the colors are (to what I can understand) very similar looking to me as a colorblind person.
1 in 12 men and 1 in 200 women go through the same sorts of experiences, and it’s worth it, if you aren’t color deficient, to try out some of the colorblindness sites and see the world as we do.
https://www.colourblindawareness.org/colour-blindness/colour...
Almost everyone, to an extent, loses some colour definition in their vision as we age, even those lucky enough to have excellent colour vision to start with. Some lose a lot more than others, and it is gradual, so mostly not noticed at first. This is one of the reasons many grandparents have the saturation oddly high on their TVs (the other main reason, of course, being they've just never changed it from the default that is picked to make the display “pop” under bright show-room lighting conditions).
It has a little window you can move over the screen to simulate a few varieties of color blindness.
Yours is on the much stronger side of the things.
Not the same, it's a gradient.
In the absence of any naturally occurring colour-blind friends, do you have any tips about surreptitiously damaging someone's eyes to create one? :-)
Though there are simulation tools available which do a reasonable job, so I'll probably stick to those where I have a concern. That feels less drastic.
This is such an app for iOS / macOS: https://michelf.ca/projects/sim-daltonism/
Why would you do that?
That is better than a random friend, because a.) there are various kinds of colorblindness and b.) you won't ask the random friend to work for your company for free.
A filter shader generally won't tell you that substituting white for green in a red/green indicator is a great option, or that a colour that they can “see” is still ambiguous when you have to describe it: “I'm clicking the purple button, but it's not doing anything” “Purple? There's no purple in the app!”
https://www.color-blindness.com/coblis-color-blindness-simul...
.. to get an idea of the impact of your UI design on color-limited folks out there ..
I used this a few times to great effect, it was very revealing to see that my carefully selected teals and ambers were incomprehensible to some folks I really wanted to use my apps .. didn't take much iteration to come to a happy palette though, just needed a bit of care.
Not reconfigure individual settings: turn off syntax highlighting altogether.
It's not about style for me, it's about readability.
We have pitch, volume, enunciation speed, and for the voice itself the vocal formant frequency can change as can the harmonics. And that is a rich field we are good at differentiating in too.
One other screen reader idea I had upon seeing this is to use a brief sound either immediately before or after, maybe even slightly overlapping the vocalization.
30 [30 MS BEEP] CO [30 MS BEEP FOLLOWED BY A SHORT CHIRP THAT INDICATES A KNOWN ADDRESS]
Writing that out looks messy. All I can say is the sounds in my head right now make a lot more sense and would complement the colors nicely.
So by all means "color everything". People have different opinions on what they want colored, so give them options.
Interesting idea. So even syntax–highlighting natural language. Grammar highlighting, as it were. Prepositions, verbs, question marks, etc. An LLM could do it. Would it actually improve readability though? Seems likely!
Not as fine-grained as individual items of grammar, but we essentially already do this and have for almost as long as writing has been a thing. Headings in bold, things that you want to emphasise in body text in italic or bold, hyperlinks underlined and/or in a different colour, …
Highlighting by grammar might be useful in language analysers/translators, for those of us trying to learn a second, though being able to pick from a selection of rules would be needed for it to be truly useful: sometimes I might want words agreeing with each other (subject->verb, subject->adjectives) in a colour of their own, sometimes specific word types (is abierto the past participle of abrir here, or the adjective?). Or verb tenses. You could perhaps do both tense and agreement, highlighting the stem by tense and the suffix with the same colour as the subject, and object pronouns the same colour as any relevant adjectives, but this is likely to make the colouring system too complex to be useful at a glance. On anything more than a single simple sentence something more dynamic might be better here, not highlighting anything by default but when something is hovered over have it and relevant things spring into colour appropriately, you'll only need small set of colours in that case rather than one for each subject/object in a longer paragraph with trying to match the same subject/object to a set colour consistently throughout.
Highly unlikely.
https://www.linusakesson.net/programming/syntaxhighlighting/
(Look at what else he's done, if you doubt the value of his opinion.)
Here's Claude attacking your post:
It's a hex editor built with imgui and has a lot of built-in tools. Imo the best feature is the data structure editor. You can write a data type definition similar to C and it overlays it on the hexdump and parses it in a structured way while you type.
It also has a node based editor.
How many rules would it take to create 256 different colors for 00..FF?
But - it's also kind of huge for a hex editor. Wouldn't it be overkill for most people?
It really is by far the best hex editor I ever used, and sooo good for reversing arbitrary binary blobs where you learn incrementally more about its structure while reversing it. The imhex patterns repo [1] also contains so many formats, it makes binwalk almost useless in comparison.
This is another fine tool I can add to my collection.
And FUTO! Love it.
We were given a file full of random bytes. The flag was in there somewhere. It was too random to be encrypted, there wasn't any structure. `file` didn't return anything, truly just a bag of bytes.
I had decided to install `hexyl` as an alternative option to some of the other hex editors installed on my Linux machine. All the bytes were colored grey.
I scrolled the file and noticed a blip of yellow. A random golden `{` amongst all the noise. Weird.
The next colored byte was a `C`, then `T`, `F`.
---
At that time, I was mostly using HexFiend to look at raw files, which didn't have byte coloring. For DEFCON I had decided to drive my Linux machine. I had ghex installed, but I had also decided to install and try `hexyl` via cli. So seeing bytes in color came down purely to the chance that I had installed it. I eventually posted an issue to ghex to add color support. https://gitlab.gnome.org/GNOME/ghex/-/issues/60
I need to see if I can find the file and post it on that blog post. https://bwiggs.com/posts/2023-08-31-hacking-in-color/
That's a rather odd remark.
Compare random data from a pseudo-random generator, truly random data, and some encrypted data. They are all different.
Color-coding might be a great solution, but you don't really know beforehand which byte values are important. Manually selecting C0 to make it stand out is just ctrl+f with extra steps. (But I wouldn't mind something like "color 00 separate from ascii separate from the rest".)
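For anyone who wants to put a number on "how random does this blob look", a crude first measurement is byte-level Shannon entropy. This is only a sketch: `shannon_entropy` is a name made up for this example, and the measure will not separate good PRNG output, true randomness, and ciphertext on its own (all three should sit close to 8 bits per byte). It mostly flags data that is far from uniform.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Illustrative samples only; real files will vary.
samples = {
    "all zeros": bytes(4096),
    "repetitive text": b"the quick brown fox jumps over the lazy dog " * 90,
    "os.urandom": os.urandom(4096),
}
for name, blob in samples.items():
    print(f"{name:>16}: {shannon_entropy(blob):.3f} bits/byte")
```

Structured or textual data lands well below 8; anything close to 8 needs other tests (or a coloured hexdump and a pair of eyes).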
That's not what they did, actually. C0 is the only byte in there that's above 3F or so, and it's far from it. Hence the very different colour, and the lack of contrast between the colours of the other bytes.
We even have characters in the Unicode for representing 0..255 variations, actually two distinct groups: Braille (arguably a bit misuse for binary) and octants (accompanied by older predecessors). So what would be
|65|97|66|98|67|99|32|126|32|72|101|108|108|111|44|32|109|111|109|33|32|240|159|166|132|
in base-10 or |41|61|42|62|43|63|20|7e|20|48|65|6c|6c|6f|2c|20|6d|6f|6d|21|20|f0|9f|a6|84|
in base-16, could be |⢈|⢊|⡈|⡊|⣈|⣊|⠂|⡾|⠂|⠌|⢪|⠮|⠮|⣮|⠦|⠂|⢮|⣮|⢮|⢂|⠂|⠛|⣵|⡣|⠡|
in Braille, or ||||||||||||||||||||||(⁕)||||
using octants. Most significant bit is at the top left here, the least one is bottom right -- it felt somewhat intuitive to me this way, your intuition may differ, obviously.
Or, naturally, "AaBbCc ~ Hello, mom! <Unicorn Emoji>" as a "UTF-8" text.
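The Braille mapping above can be reproduced in a few lines of Python. This is a sketch, not the code behind the linked pages: the names `DOT_FLAGS`, `byte_to_braille` and `bytes_to_braille` are invented here, and the bit order (row by row, most significant bit top left, least significant bottom right) is inferred from the example string in the comment. Unicode numbers Braille dots column-wise, which is why the table looks shuffled.

```python
# Unicode Braille patterns start at U+2800; each dot is one flag bit:
# dots 1,2,3,7 form the left column (0x01, 0x02, 0x04, 0x40),
# dots 4,5,6,8 form the right column (0x08, 0x10, 0x20, 0x80).
# Flags below are listed for byte bits 7..0, read row by row, left to right.
DOT_FLAGS = [0x01, 0x08, 0x02, 0x10, 0x04, 0x20, 0x40, 0x80]


def byte_to_braille(value: int) -> str:
    """Render one byte as a single 2x4 Braille cell."""
    code = 0x2800
    for position, flag in enumerate(DOT_FLAGS):
        bit = 7 - position  # position 0 is the most significant bit
        if (value >> bit) & 1:
            code |= flag
    return chr(code)


def bytes_to_braille(data: bytes) -> str:
    """Render a byte string as one Braille cell per byte."""
    return "".join(byte_to_braille(b) for b in data)


print(bytes_to_braille(b"AaBbCc ~ Hello, mom!"))
```

Feeding it the 25 bytes from the hex line above yields the same Braille string as in the comment.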
Try: http://myfonj.github.io/tst/byte-dec-hex-braille-octant.html) Test (with added "CSS" variant and "highlight" of empty dots): http://myfonj.github.io/tst/byte-visualisation-exploration.h...
(⁕) HN apparently eats upper-half block. Amusing that only this particular ("old", as referred earlier) one got filtered out…
Also a caveat: Android phones have a messed-up Braille block due to an outdated, broken embedded font, so all patterns with dots in the left half appear in the right instead. Long reported, not fixed, IIRC.
By the way, the "octants" representation is unreadable to me and I have HN at like 140% zoom.
One thing I often ponder on, along similar lines, is whether I can write some clever plugin that would put FF ChartWell - a font which uses ligatures to render useful graphs out of boring numerical data - into use, within ImHex. Seen how ligatures can be used this way?
https://typographica.org/typeface-reviews/chartwell/
Your idea of discovering new means of representing numeral data put me in mind of ligature hacking, in any case ...
dd if=/dev/urandom bs=$((256*256)) count=1 | display -size 256x256 -depth 8 GRAY:-
You can do the same thing with audio, which makes different sorts of patterns obvious, e.g.
dd if=/dev/urandom bs=$((256*256)) count=1 | aplay -c 1
This plays sounds, sounds like "Liberate Mae?"
If that’s true, how does the tool know I will be looking for C0 bytes and not for 03, D3, etc? The logical conclusion of that would be that the hex editor should uniquely color code every byte. And following the other examples even that’s not enough.
The proposed solution is to create groups of byte values that each get their unique color. I think that helps, but we can do better: add a search feature. That tells your editor what you are looking for. Once you enter a search string, it can highlight all hits.
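A rough sketch of that search-then-highlight idea in Python, assuming a terminal that understands ANSI escape codes. The function names are made up for illustration; reverse video (`ESC[7m`) is used instead of a colour so the hits do not depend on colour vision.

```python
def highlight_hits(data: bytes, needle: bytes) -> list[bool]:
    """Return one flag per byte: True if that byte is part of a match."""
    marks = [False] * len(data)
    if not needle:
        return marks
    start = data.find(needle)
    while start != -1:
        for i in range(start, start + len(needle)):
            marks[i] = True
        start = data.find(needle, start + 1)  # +1 so overlapping hits count
    return marks


def render(data: bytes, needle: bytes, width: int = 16) -> str:
    """Format a hexdump with every search hit shown in reverse video."""
    marks = highlight_hits(data, needle)
    lines = []
    for offset in range(0, len(data), width):
        cells = []
        for i in range(offset, min(offset + width, len(data))):
            cell = f"{data[i]:02x}"
            if marks[i]:
                cell = f"\x1b[7m{cell}\x1b[0m"  # reverse video on, then reset
            cells.append(cell)
        lines.append(f"{offset:08x}  " + " ".join(cells))
    return "\n".join(lines)


print(render(bytes([0x03, 0x1F, 0xC0, 0x22, 0x3A, 0xC0, 0x01]), b"\xc0"))
```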
Yes, “colorful output in a hexdump is useful for the same reason that syntax highlighting for code is useful”, but do you know what syntax highlighting needs? Knowledge of the expected content of a file. Without that, a hex editor at best can guess at how to color-code stuff.
IMO, if you want to add syntax coloring to a hex editor, give it pluggable syntax coloring and heuristics for deciding which one to use when.
While at it, also let those plugins control where to break lines, whether to show hex at all (why show it at all if a file has a few paragraphs of English text or an array of IEEE doubles?), etc.
Those plug-ins will make errors and sometimes, users will want to see all byte values, so you’ll need a way for the user to override them.
As a note, the "some" up there is load-bearing - color may lull you into complacency where the difference between 01 and 0F is major and important but not highlighted. More complicated regex-built color tools designed to highlight "anomalies" could be developed, but then you need to define what anomalies are (patterns, places where a pattern changes, etc).
> Your hex editor should colour-code bytes so it is easier for users to distinguish patterns
> Article is fully in lowercase, which makes it harder for readers to make out sentences and the flow of the article
> mfw the irony
I wrote this a LONG while back. At the time, I was fresh out of college and my first job included a lot of reverse engineering communication protocols to make machines work together for automation purposes. I will personally testify to how useful it was to see visual patterns to aid in this. The single biggest benefit was seeing that one particular protocol switched endianness WITHIN a specific packet.
Another option would be to load data in pandas and display it in a Jupyter notebook with style.background_gradient()
Polars delegates styling to Great Tables, but it's also doable there: https://posit-dev.github.io/great-tables/get-started/coloriz...
The post puts an interesting point on the table: how to improve the presentation layer to fit what human cognition is good at spotting (in general, or at least for the expected audience with some training). And it does start proposing something with these color schemes. But isn’t it kind of missing the forest for the trees? Actually, why do we even render with [0123456789ABCDEF], when a specific set of (colored/imaged?) glyphs would be able to make more obvious what’s on the table? Or even beyond the hexadecimal grouping, wouldn’t it be more relevant to render something "intuitively" far easier to grasp without several layers of internalized interpretation through acculturation?
Of course, if you know about the format, there are better ways, but it goes beyond the scope of a hex editor, though the most advanced ones support things like template files and can display structured data, disassembly, etc...
Most of us have internalized the relationship between digits in [0-9] for a very long time. Adding 6 more glyphs after that is quite easy (and they're also somewhat well known in the world), and after a while you stop even thinking about the glyphs consciously anyway. A hex 'C' intuitively means to me '4 from the end'. A hex 'F' intuitively means to me 'all 4 bits are set to 1'. I don't see any advantage to switching to a different glyph set for this base, other than disruption for disruption's sake.
> Or even beyond the hexadecimal grouping, wouldn’t it be more relevant to render something "intuitively" far easier to grasp without several layers of internalized interpretation through acculturation?
Modern computers deal with 8-bit bytes, and their word sizes are a multiple of bytes - unless you're dealing with bit-packed data, which is comparatively rare (closest is bit twiddling of MMIO registers, which is when you sometimes switch to binary; although for a 4-bit hex nibble you can still learn arbitrary combinations of bits on/off into its value).
This means you can group 8 bits into 1 digit of 8 bits as one glyph (alphabet too large to be useful), 2 digits of 4 (hex), 4 digits of 2 (alphabet too small to give a benefit over binary) and 8 digits of 1 (binary). Hex just works really well as a practical middle ground.
Back when computers used 12 bit words (PDP-8 and friends) octal (4 digits of 3 bits represented in the 0-7 alphabet) was more popular.
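To make the groupings concrete, here is a small Python sketch that splits the same value into digits of 1, 2, 3 or 4 bits. The helper name `digits` is invented for this example; the 8-bits-per-digit case is left out because it would need a 256-glyph alphabet.

```python
def digits(value: int, bits_per_digit: int, total_bits: int = 8) -> str:
    """Split `value` into equal-width digits, most significant digit first."""
    if not 1 <= bits_per_digit <= 4:
        raise ValueError("this sketch only has glyphs for 1 to 4 bits per digit")
    if total_bits % bits_per_digit:
        raise ValueError("word size must be a multiple of the digit width")
    alphabet = "0123456789abcdef"
    mask = (1 << bits_per_digit) - 1
    count = total_bits // bits_per_digit
    return "".join(
        alphabet[(value >> (bits_per_digit * i)) & mask]
        for i in reversed(range(count))
    )


byte = 0xC0
print(digits(byte, 4))  # 2 digits of 4 bits (hex)
print(digits(byte, 2))  # 4 digits of 2 bits
print(digits(byte, 1))  # 8 digits of 1 bit (binary)

word12 = 0o7654
print(digits(word12, 3, total_bits=12))  # 4 octal digits for a 12-bit word
```

Note that 3-bit digits do not divide an 8-bit byte evenly, which the function rejects; that is the practical reason octal fits 12-bit words better than bytes.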
Then a byte was represented as a 16x16 matrix where each 4x4 area had the lower digit's pattern, and these were arranged in the shape of the higher digit.
But at the end of the day, it wasn't really more readable.
I'd want to take it further by using full RGB and cycling through some colormaps with different properties. Sequential, diverging, cyclic like in matplotlib.
https://matplotlib.org/stable/users/explain/colors/colormaps...
Can't think of a specific use-case off the top of my head, but sometimes I just want the "feel" of the data when I'm plotting something, and maybe the same scattershot approach would pay off at some point on unknown hex data if it was an option.
I'll pass thank you
https://github.com/0xfalafel/hextazy
Some nice features are:
- robust undo
- insertions (not just overwriting)
- inspect the value of selected bytes
- search
I have also recently discovered this other project that is pretty cool: https://github.com/mentebinaria/dz6
[1] https://rizin.re
https://soegaard.github.io/peek/#%28part._binary-files%29
For me the key insight is that similar values should get similar colors. And since Fx and 0x are "similar" the color palette should be cyclic.
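One way to get a cyclic palette is to map the byte value onto the hue circle, so neighbouring values get neighbouring hues and FF wraps around next to 00. A minimal sketch with Python's standard `colorsys` module; the function names and the 24-bit ANSI escape output are assumptions of this example, not taken from any of the tools mentioned in the thread.

```python
import colorsys


def byte_to_rgb(value: int, saturation: float = 1.0) -> tuple[int, int, int]:
    """Map 0x00..0xFF onto the HSV hue circle and return 8-bit RGB."""
    hue = value / 256  # 0xFF ends up one step before wrapping back to 0x00
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    return round(r * 255), round(g * 255), round(b * 255)


def paint(value: int) -> str:
    """Return the byte as two hex digits wrapped in a truecolor escape."""
    r, g, b = byte_to_rgb(value)
    return f"\x1b[38;2;{r};{g};{b}m{value:02x}\x1b[0m"


# Print all 256 values as a 16x16 grid to eyeball the palette.
for row in range(16):
    print(" ".join(paint(row * 16 + col) for col in range(16)))
```

A pure hue wheel is not kind to colour-blind readers, so in practice it would need a configurable or perceptually designed cyclic colormap rather than raw HSV.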
Different colouring schemes for different types of data.
The implicit cost here is that the simple patterns become harder to recognize when every byte is only subtly differently colored. Rather than give everything a different color, I'd rather have the important stuff highlighted.
In the comparisons given, I think hexyl's highlighting scheme is significantly more useful.
[1]: ide.kaitai.io
I grant that the post largely has a point, mind you. But scanning for a needle in a haystack is something that you just don't often do?
I am, of course, now very curious how often folks are using hex editors. And itching for an excuse to open a file that way. :D
The colors make it worse as I'm red-green colorblind. Looking at that mess is eye strain.
Honestly I mostly prefer syntax highlighting turned off as it causes eye strain. I have found the black on light yellow theme of the Acme editor to be a very comfortable monochrome color scheme.
Ctrl+C, Ctrl+F, Ctrl+V... Easy!
The cool thing about it imo (outside of colors) is a `--windows` flag. Which separates the hex view into partitions: so `-w 2:-3:5` shows the first two bytes on a line, then skips three bytes, then shows the next 5 bytes on a line, then the rest of the file. Easy to use combined with a terminal's up arrow.
But color would be nice more based on the bytes logic.
Eventually the 00 in a shaded grey instead of black, and in best case scenario by logic unit based on your protocol. And worst case scenario by groups of words or so.
excuse me? "basic" and "runs in your browser" together sound very contradictory to me. while doing things i actually feel (yes, emotionally) much better when there is no browser open on my machine, but only text editors, vcs gui and file managers, and terminals of course. and sometimes i reject an idea to start a browser just thinking how much ram it will take (ha, what a progress we have done - one github issue tab, with text only and no images, takes 180mb of ram).
b'\x100'
It was not obvious to me from the print output that this is \x10 followed by a literal 0.
Don't really see the advantage. Unique bytes have no unique meaning across data types.
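On the `b'\x100'` output mentioned above: the ambiguity comes from Python's `repr` mixing hex escapes with printable ASCII. A short sketch showing two unambiguous alternatives from the standard library (the `sep` argument to `bytes.hex` needs Python 3.8 or later):

```python
data = bytes([0x10, 0x30])  # 0x30 happens to be the ASCII digit "0"

print(data)           # repr mixes a \x escape with the printable "0"
print(len(data))      # two bytes, not one
print(data.hex(" "))  # every byte as two hex digits, space separated
print(list(data))     # plain integers
```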
The only good syntax highlight to me is 00 and perhaps FF. But that's my opinion of course.
Anything else that has no direct relation to what you're looking at is meaningless.
Would probably make the most sense to have various ranges you can enable depending on what you’re looking for (or to look for patterns) e.g. for single byte coloration I could see
- nul
- printable / non-printable ascii
- non-ascii
- UTF8 leading / continuation
- separators
- start/end pairs (both printable and non printable)
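A possible starting point for such ranges, sketched in Python. The category names, the choice to treat ASCII whitespace as the "separators" group, and the ANSI colour assignments are all assumptions of this example; start/end pairs are left out because they need context beyond a single byte.

```python
def classify(value: int) -> str:
    """Put a single byte value into one coarse display category."""
    if value == 0x00:
        return "nul"
    if value in b" \t\n\r\x0b\x0c":
        return "whitespace"
    if 0x21 <= value <= 0x7E:
        return "printable"
    if value < 0x80:
        return "control"  # remaining ASCII: control characters and DEL
    if value <= 0xBF:
        return "utf8-continuation"  # bit pattern 10xxxxxx
    if 0xC2 <= value <= 0xF4:
        return "utf8-lead"  # first byte of a multi-byte sequence
    return "other"  # 0xC0, 0xC1, 0xF5..0xFF never occur in valid UTF-8


# Hypothetical colour choices (ANSI SGR codes); swap per user preference.
SGR = {
    "nul": "90", "whitespace": "32", "printable": "36", "control": "35",
    "utf8-continuation": "33", "utf8-lead": "93", "other": "31",
}


def paint_line(chunk: bytes) -> str:
    """Render one hexdump row with each byte coloured by category."""
    return " ".join(f"\x1b[{SGR[classify(b)]}m{b:02x}\x1b[0m" for b in chunk)


print(paint_line("Hi\x00\t\u00e9\n".encode("utf-8") + b"\xc0\xff"))
```

Each category could then be toggled on or off independently, which is roughly the "ranges you can enable" idea.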
Hexmode https://github.com/fidian/hexmode or vim-deadc0de https://www.vim.org/scripts/script.php?script_id=6033
With decent color support? No.
It's been a while since I used hexedit on Linux, but I think that highlighted search results in reverse colours, just like less does for text search. Personally, I'd prefer that to colours.