• mina86.com

  • Categories
  • Code
  • Contact
  • Computer Science vs Reality

    Robin: ‘Let’s use a linked li—’; Batman: *slaps Robin* ‘Vector is faster’

    Some years ago, during a friendly discussion about C++, a colleague challenged me with a question: what’s the best way to represent a sequence of numbers if delete operation is one that needs to be supported. I argued in favour of a linked list suggesting that with sufficiently large number of elements, it would be much preferred.

    In a twist of fate, I’ve been recently discussing an algorithm which reminded my of that conversation. Except this time I was the one arguing against a node-based data structure. Rather than ending things at a conversation, I’ve decided to benchmark a few solutions to make sure which approach is the best.

    The problem

    The task at hand is simple. Design a data structure which stores a set of words, all of the same length, and offers lookup operation which returns all words matching globs in the form ‘prefix*suffix’. That is, words which start with a given prefix and end with a given suffix. Either part of the pattern may be empty and their concatenation is never longer than length of the words in the collection. Initialisation time and memory footprint are not a concern. Complexity of returning a result can be assumed to be constant.

    In this article I’me going to describe possible solutions — some using a boring vector while others taking advantage of an exciting prefix tree — and benchmark the implementations in an ultimate battle between contiguous-memory-based and a node-based containers.

    Embrace the Bloat

    ‘I’m using slock as my screen locker,’ a wise man once said. He had a beard so surely he was wise.

    ‘Oh?’ his colleague raised a brow intrigued. ‘Did they fix the PAM bug?’ he prodded inquisitively. Nothing but a confused stare came in reply. ‘slock crashes on systems using PAM,’ he offered an explanation and to demonstrate, he approached a nearby machine and pressed the Return key.

    Screens, blanked by a locker a few minutes prior, came back to life, unlocked without the need to enter the password.

    The L*u*v* and LChuv colour spaces

    I’ve written about L*a*b* so it’s only fair that I’ll also describe its twin sister: the L*u*v* colour space (a.k.a. CIELUV). The two share a lot in common. For example, they use the same luminance value, base their chromaticity on opponent process theory and each of them has a corresponding cylindrical LCh coordinate system. Yet, despite those similarities — or perhaps because of them — the CIELUV colour space is often overlooked.

    Photo of a panther chameleonu* and v* channels of the photo
    Fig. 1 Picture of a chameleon with its decomposition into L*, u* and v* channels. Photo by Dr Pratt Datta.

    Even though L*a*b* is getting all the limelight, L*u*v* model has its advantages. Before we start comparing the two colour spaces, let’s first go through the conversion formulæ.

    Most vexing parse

    Here’s a puzzle: What does the following C++ code output:

    #include <cstdio>
    #include <string>
    
    struct Foo {
    	Foo(unsigned n = 1) {
    		std::printf("Hell%s,", std::string(n, 'o').c_str());
    	}
    	~Foo() {
    		std::printf("%s", " world");
    	}
    };
    
    static constexpr double pi = 3.141592653589793238;
    
    int main(void) {
    	Foo foo();
    	Foo bar(unsigned(pi));
    }

    Will the real ARG_MAX please stand up? Part 2

    In part one we’ve looked at the ARG_MAX parameter on Linux-based systems. We’ve established experimentally how it limits arguments passed to programs and what influences the value. This time, we’ll look directly at the source to verify our findings and see how ARG_MAX looks from the point of view of system libraries and kernel itself.

    Pro tip: Start your passwords with /!

    Anyone who uses a screen locker surely can recall a situation where they approached their computer and started typing their password to unlock it even though it was never locked. Even if the machine is configured to lock automatically after a period of inactivity, there may be situations when power saving blanks the monitor even before the automatic locking happens.

    If one’s lucky, they realise their mistake in time before hitting Return in a chat window. It’s not uncommon however that one ends with the password blasted into ether over IRC or collaborative editing software; lazy people might ignore the secret getting saved in their shell history file but even that should facilitate, often annoying, password change.

    What if I told you there’s a way to avoid those problems? A one simple trick which will eliminated at least some forms of possible leaks. Simply prefix all your passwords with /! (slash followed by an exclamation mark).

    Dark theme with media queries, CSS and JavaScript

    Split view of Tower Bridge during the day and at night.
    © Franck Matellini

    No, your eyes are not deceiving you. This website has gone through a redesign and in the process gained a dark mode. Thanks to media queries, the darkness should commence automatically according to reader’s system preferences (as reported by the browsers). You can also customise this website in settings panel in top right (or bottom right).

    What are media queries? And how to use them to adjust website’s appearance based on user preferences? I’m glad you’ve asked, because I’m about to describe the CSS and JavaScript magic that enables this feature.

    Media queries overview

    body { font-family: sans-serif; }
    @media print {
    	body { font-family: serif; }
    }

    Media queries grew from the @media rule present since the inception of CSS. At first it provided a way to use different styles depending on a device used to view the page. Most commonly used media types where screen and print as seen in the example on the right. Over time the concept evolved into general media queries which allow checking other aspects of the user agent such as display size or browser settings. A simple stylesheet respecting reader’s preferences might be as simple as:

    body {
    	/* Black-on-white by default */
    	background: #fff;
    	color: #000;
    }
    @media (prefers-color-scheme: dark) {
    	/* White-on-black if user prefers dark colour scheme */
    	body {
    		background: #000;
    		color: #fff;
    	}
    }

    That’s enough to get us started but not all browsers support that feature or provide a way for the user to specify desired mode. For example, without a desktop environment Chrome will report light theme preference and Firefox users need to go deep into the bowels of about:config to change ui.systemUsesDarkTheme flag if they are fond of darkness. To accommodate such situations, it’s desirable to provide a JavaScript toggle which defaults to option specified in system settings.

    Fortunately, media can be queried through JavaScript and herein I’ll describe how it’s done and how to marry theme switching with browser preferences detection. TL;DR version is to grab a demonstration HTML file which includes a fully working CSS and JavaScript code that can be used to switch themes on a website.

    sRGB↔L*a*b*↔LChab conversions

    After writing about conversion between sRGB and XYZ colour spaces I’ve been asked about a related process: moving between sRGB and CIELAB (perhaps better known as L*a*b*). As this may be of interest to others, I’ve decided to go ahead and make an article out of it. I’ll also touch on CIELChab which is a closely related colour representation.

    Photo of a panther chameleona* and b* channels of the photo
    Picture of a chameleon with its decomposition into L*, a* and b* channels. Photo by Dr Pratt Datta.

    The L*a*b* colour space was intended to be perceptually uniform. While it’s not truly uniform it’s nonetheless useful and widely used in the industry. For example, it’s the basis of the ΔE*00 colour difference metric. LChab aim to make L*a*b* easier to interpret by replacing a* and b* axes with more intuitive chroma and hue parameters.

    Importantly, the conversion between sRGB and L*a*b* goes through XYZ colour space. As such, the full process has multiple steps with a round trip conversion being: sRGB​→​XYZ​→​L*a*b*​→​XYZ​→​sRGB. Because of that structure I will describe each of the steps separately.

    Will the real ARG_MAX please stand up? Part 1

    arg max is a set of values from function’s domain at which said function reaches its maxima. That’s certainly an arg max but not the one we’re after. No, this article is regarding the ARG_MAX that limits the length of arguments to an executable. Or in other words, why you are getting:

    bash: command: Argument list too long

    HTML: No, you don’t need to escape that

    This website being my personal project allows me to experiment and do things I’d never do in professional settings. Most notably, I’m rather fond of trying everything I can to reduce the size of the page. This goes beyond mere minification and eventually lead me to wonder if all those characters I’ve been escaping in HTML code require such treatment.

    Libraries offering HTML support will typically provide a function to indiscriminately replace all ampersands, quote characters, less-than and greater-then signs with their corresponding HTML-safe representation. This allows the result to be used in any context in the document and is a good choice for user-input validation. It’s a different matter when it comes to squeezing every last byte. Herein I will explore which characters and under what conditions need to be escaped in an HTML document.