Articles — mina86.comhttp://mina86.com/atom/cat/articles/content/html/2023-04-01T13:13:13ZMichał ‘mina86’ Nazarewiczhttps://mina86.comRAII: Tragedy in three actshttp://mina86.com/2023/04/01/raii-c-cpp-and-rust2023-04-01T13:13:13Z2023-04-01T13:13:13ZMichał ‘mina86’ Nazarewiczhttps://mina86.com<p>In <a href="//www.youtube.com/watch?v=pTMvh6VzDls">a recent Computerphile video</a>, Ian Knight talked about the RAII idiom and its application in C++ and Rust. While the video described the general concepts, I felt different examples could more clearly convey the essence of the topic.<p>I’ve decided to give my own explanation to hopefully better illustrate what RAII is and how it relates to Rust’s ownership. Then again, for whatever reason I’ve decided to write it as a play with dialogue in faux Old English so it may well be even more confusing instead.<style>.play h3{font-size:1.25em;margin:0}.play td{vertical-align:top}.play td:not(:last-child){font-size:0.875em;font-variant:small-caps;text-align:right;text-transform:uppercase;}.play td:not(:last-child) abbr{font-variant:small-caps;text-transform:uppercase;}.play td:last-child{text-align:justify;}.dir{font-style:oblique}</style><table class=play><tbody><tr><td><td><h2>Cast of characters</h2><tr><td><td class=sm>(In the order of appearance)<tr><td>Gregory<td>A software engineer and Puteal’s employee #1<tr><td>Sampson<td>A software engineer and a self-proclaimed 10× developer<tr><td>Paris<td>An apprentice returning to Puteal two summers in a row<tr><td>CTO<td>Puteal’s Chief Technical Officer spending most of his time in meetings<tr><td>Admin<td>Administrative assistant working in Puteal Corporation’s headquarters in Novear<tbody><tr><td><td><h2>Act I</h2><tbody><tr><td><h3>Scene I</h3><td>Novear. 
A public place.<tr><td><td class=dir>Enter Sampson and Gregory, two senior engineers of the Puteal Corporation, carrying laptops and phones<tr><td>Gregory<td>Pray tell, what doth the function’s purpose?<tr><td>Sampson<td>It doth readeth a number from a file. A task as trivial as can be and yet QA reports memory leak after my change. Hence, I come to thee for help.<tr><td><td class=dir>Both look at a laptop showing code Sampson has written [error handling omitted for brevity from all source code listings]:<tr><td colspan=2><pre>double read_double(FILE *fd) {
char *buffer = malloc(1024); <i>/* allocate temporary buffer */</i>
fgets(buffer, 1024, fd); <i>/* read first line of the file */</i>
return atof(buffer); <i>/* parse and return the number */</i>
}</pre><tr><td>Gregory<td>Thine mistake is apparent. Thou didst allocate memory but ne’er freed it. Verily, in C thou needs’t to explicitly free any memory thou dost allocate. Submit this fix and thy code shall surely pass.<tr><td colspan=2><pre>double read_double(FILE *fd) {
char *buffer = malloc(1024); <i>/* allocate temporary buffer */</i>
fgets(buffer, 1024, fd); <i>/* read first line of the file */</i>
double result = atof(buffer); <i>/* parse the line */</i>
<ins>free(buffer); <i>/* free the temporary buffer */</i></ins>
return result; <i>/* return parsed number */</i>
}</pre><tbody><tr><td><h3>Scene II</h3><td>A hall.<tr><td><td class=dir>Enter Puteal CTO, an apprentice called Paris and an Admin<tr><td>Paris<td>I’ve done as Sampson beseeched of me. I’ve taken the <code>read_double</code> function and changed it so that it doth taketh file path as an argument. He hath warned me about managing memory and so I’ve made sure all temporary buffers are freed. Nonetheless, tests fail.<tr><td colspan=2><pre>double read_double(const char *path) {
FILE *fd = fopen(path, "r"); <i>/* open file */</i>
<small>char *buffer = malloc(1024);</small>
<small>fgets(buffer, 1024, fd);</small>
<small>double result = atof(buffer);</small>
<small>free(buffer);</small>
<small>return result;</small>
}</pre><tr><td>CTO<td>Thou didst well managing memory, but memory isn’t the only resource that needs to be freed. Just like allocations, if thou dost open a file, thou must close it anon once thou art done with it.<tr><td><td class=dir>Exit CTO and Admin towards sounds of a starting meeting<tr><td>Paris<td>Managing resources is no easy task but I think I’m starting to get the hang of it.<tr><td colspan=2><pre>double read_double(const char *path) {
FILE *fd = fopen(path, "r"); <i>/* open file */</i>
<small>char *buffer = malloc(1024);</small>
<small>fgets(buffer, 1024, fd);</small>
<ins>fclose(fd); <i>/* close the file */</i></ins>
<small>double result = atof(buffer);</small>
<small>free(buffer);</small>
<small>return result;</small>
}</pre><tbody><tr><td><h3 na>Scene III</h3><td>Novear. A room in Puteal’s office.<tr><td><td class=dir>Enter Paris and Sampson they set them down on two low stools, and debug<tr><td>Paris<td>The end of my apprenticeship is upon me and yet my code barely doth work. It canst update the sum once but as soon as I try doing it for the second time, nothing happens.<tr><td colspan=2><pre>double update_sum_from_file(mtx_t *lock,
double *sum,
const char *path) {
double value = read_double(path); <i>/* read next number from file */</i>
mtx_lock(lock); <i>/* reserve access to `sum` */</i>
value += *sum; <i>/* calculate sum */</i>
*sum = value; <i>/* update the sum */</i>
return value; <i>/* return new sum */</i>
}</pre><tr><td>Sampson<td>Thou hast learned well that resources need to be acquired and released. But what thou art missing is that not only system memory or a file descriptor are resources.<tr><td>Paris<td>So just like memory needs to be freed, files need to be closed and locks need to be unlocked!<tr><td colspan=2><pre>double update_sum_from_file(mtx_t *lock,
double *sum,
const char *path) {
double value = read_double(path); <i>/* read next number from file */</i>
mtx_lock(lock); <i>/* reserve access to `sum` */</i>
value += *sum; <i>/* calculate sum */</i>
*sum = value; <i>/* update the sum */</i>
<ins>mtx_unlock(lock); <i>/* release `sum` */</i></ins>
return value; <i>/* return new sum */</i>
}</pre><tr><td rowspan=2>Paris<td>I’m gladdened I partook the apprenticeship. Verily, I’ve learned that resources need to be freed once they art no longer used. But also that many things can be modelled like a resource.<tr><td>I don’t comprehend why it all needs to be done manually?<tr><td><td class=dir>Exit Sampson while Paris monologues leaving him puzzled<tbody><tr><td><td><h2>Act II</h2><tbody><tr><td><h3>Scene I</h3><td>Court of Puteal headquarters.<tr><td><td class=dir>Enter Sampson and Paris bearing a laptop before him<tr><td>Paris<td>Mine last year’s apprenticeship project looks naught like mine own handiwork.<tr><td>Sampson<td>Thou seest, in the year we migrated our code base to C++.<tr><td>Paris<td>Aye, I understandeth. But I spent so much time learning about managing resources and yet the new code doth not close its file.<tr><td><td class=dir>Enter Gregory and an Admin with a laptop. They all look at code on Paris’ computer:<tr><td colspan=2><pre>double read_double(const char *path) {
std::fstream file{path}; <i>/* open file */</i>
double result; <i>/* declare variable to hold result */</i>
file >> result; <i>/* read the number */</i>
return result; <i>/* return the result */</i>
}</pre><tr><td>Sampson<td>Oh, that’s RAII. <dfn>Resource Acquisition Is Initialisation</dfn> idiom. C++ useth it commonly.<tr><td rowspan=2>Gregory<td>A resource is acquired when an object is initialised and released when that object is destroyed. The compiler tracks lifetimes of local variables and thusly handles resources for us.<tr><td>By this method, all manner of resources can be managed. And forsooth, for more abstract concepts without a concrete object representing them, such as the concept of exclusive access to a variable, a guard class can be fashioned. Gaze upon this other function:<tr><td colspan=2><pre>double update_sum_from_file(std::mutex &mutex,
double *sum,
const char *path) {
double value = read_double(path); <i>/* read next number from file */</i>
std::lock_guard<std::mutex> lock{mutex}; <i>/* reserve access to `sum` */</i>
value += *sum; <i>/* calculate sum */</i>
*sum = value; <i>/* update the sum */</i>
return value; <i>/* return new sum */</i>
}</pre><tr><td>Paris<td>I perceive it well. When the <code>lock</code> goes out of scope, the compiler shall run its destructor, which shall release the mutex. Such was my inquiry yesteryear. Thus, compilers can render managing resources more automatic.<tbody><tr><td><h3>Scene II</h3><td>Novear. Sampson’s office.<tr><td><td class=dir>Enter Gregory and Sampson<tr><td>Sampson<td>Verily, this bug doth drive me mad! To make use of the RAII idiom, I’ve writ an <code>nptr</code> template to automatically manage memory.<tr><td colspan=2><pre>template<class T>
struct nptr {
nptr(T *ptr) : ptr(ptr) {} <i>/* take ownership of the memory */</i>
~nptr() { delete ptr; } <i>/* free memory when destructed */</i>
T *operator->() { return ptr; }
T &operator*() { return *ptr; }
private:
T *ptr;
};</pre><tr><td>Gregory<td>I perceive… And what of the code that bears the bug?<tr><td>Sampson<td>'Tis naught but a simple code which calculates sum of numbers in a file:<tr><td colspan=2><pre>std::optional<double> try_read_double(nptr<std::istream> file) {
double result;
return *file >> result ? std::optional{result} : std::nullopt;
}
double sum_doubles(const char *path) {
nptr<std::istream> file{new std::fstream{path}};
std::optional<double> number;
double result = 0.0;
while ((number = try_read_double(file))) {
result += *number;
}
return result;
}</pre><tr><td><td class=dir>Enter Paris with an inquiry for Sampson; seeing them talk he pauses and listens in<tr><td rowspan=2>Gregory<td>The bug lies in improper ownership tracking. When ye call the <code>try_read_double</code> function, a copy of thy <code>nptr</code> is made pointing to the file stream. When that function doth finish, it frees that very stream, for it believes that it doth own it. Alas, then you try to use it again in next loop iteration.<tr><td>Why hast thou not made use of <code>std::unique_ptr</code>?<tr><td>Sampson<td>Ah! I prefer my own class, good sir.<tr><td>Gregory<td>Thine predicament would have been easier to discern if thou hadst used standard classes. In truth, if thou wert to switch to the usage of <code>std::unique_ptr</code>, the compiler would verily find the issue and correctly refuse to compile the code.<tr><td colspan=2><pre>std::optional<double> try_read_double(<ins>std::unique_ptr<std::istream></ins> file) {
double result;
return *file >> result ? std::optional{result} : std::nullopt;
}
double sum_doubles(const char *path) {
<ins>auto file = std::make_unique<std::fstream>(path);</ins>
std::optional<double> number;
double result = 0.0;
while ((number = try_read_double(file))) { <i>/* compile error */</i>
result += *number;
}
return result;
}</pre><tr><td><td class=dir>Exit Gregory, exit Paris moment later<tbody><tr><td><h3 na>Scene III</h3><td>Before Sampson’s office.<tr><td><td class=dir>Enter Gregory and Paris, meeting<tr><td>Paris<td>I’m yet again vexed. I had imagined that with RAII, the compiler would handle all resource management for us?<tr><td>Gregory<td>Verily, for RAII to function, each resource must be owned by a solitary object. If the ownership may be duplicated then problems shall arise. Ownership may only be moved.<tr><td>Paris<td>Couldn’t compiler enforce that just like it can automatically manage resources?<tr><td>Gregory<td>Mayhap the compiler can enforce it, but it’s not a trivial matter. Alas, if thou art willing to spend time to model ownership in a way that the compiler understands, it can prevent some of the issues. However, thou wilt still require an escape hatch, for in the general case, the compiler cannot prove the correctness of the code.<tr><td><td class=dir>Exit Gregory and Paris, still talking<tbody><tr><td><td><h2 na>Act III</h2><tbody><tr><td><h3>Scene I</h3><td>A field near Novear.<tr><td><td class=dir>Enter Gregory and Paris<tr><td>Gregory<td>Greetings, good fellow! How hast thou been since thy apprenticeship?<tr><td>Paris<td>I’ve done as thou hast instructed and looked into Rust. It is as thou hast said. I’ve recreated Sampson’s code and the compiler wouldn’t let me run it:<tr><td colspan=2><pre>fn try_read_double(rd: Box<dyn std::io::Read>) -> Option<f64> {
todo!()
}
fn sum_doubles(path: &std::path::Path) -> f64 {
let file = std::fs::File::open(path).unwrap();
let file: Box<dyn std::io::Read> = Box::new(file);
let mut result = 0.0;
while let Some(number) = try_read_double(file) {
result += number;
}
result
}</pre><tr><td>Gregory<td>Verily, the compiler hath the vision to behold the migration of file’s ownership into the realm of <code>try_read_double</code> function during the first iteration and lo, it is not obtainable any longer by <code>sum_doubles</code>.<tr><td colspan=2><pre>error[E0382]: use of moved value: `file`
let file: Box<dyn std::io::Read> = Box::new(file);
---- move occurs because `file` has type `Box<dyn std::io::Read>`,
which does not implement the `Copy` trait
let mut result = 0.0;
while let Some(number) = try_read_double(file) {
^^^^ value moved here, in previous
iteration of loop</pre><tr><td>Paris<td>Alas, I see not what thou hast forewarned me of. The syntax present doth not exceed that which wouldst be used had this been writ in C++:<tr><td colspan=2><pre>fn try_read_double(rd: &dyn std::io::Read) -> Option<f64> {
todo!()
}
fn sum_doubles(path: &std::path::Path) -> f64 {
let file = std::fs::File::open(path).unwrap();
let file: Box<dyn std::io::Read> = Box::new(file);
let mut result = 0.0;
while let Some(number) = try_read_double(&*file) {
result += number;
}
result
}</pre><tr><td>Gregory<td>Verily, the Rust compiler is of great wit and often elides lifetimes. Nonetheless, other cases may prove more intricate.<tr><td colspan=2><pre>struct Folder<T, F>(T, F);
impl<T, F: for <'a, 'b> Fn(&'a mut T, &'b T)> Folder<T, F> {
fn push(&mut self, element: &T) {
(self.1)(&mut self.0, element)
}
}</pre><tr><td>Paris<td>Surely though, albeit this code is more wordy, it is advantageous if I cannot commit an error in ownership.<tr><td>Gregory<td>Verily, there be manifold factors in the selection of a programming tongue. And there may be aspects which may render other choices not imprudent.</table><h2>Aforeword</h2><p>A thing to keep in mind is that the examples are somewhat contrived. For example, the buffer and file object present in the <code>read_double</code> function can easily live on the stack. Real-life code wouldn’t bother allocating them on the heap. Then again, I could see a beginner make the mistake of trying to work around <code>std::unique_ptr</code> not having a copy constructor by creating objects on the heap and passing pointers around.<p>In the end, is this a better explanation than the one in the aforementioned Computerphile video? I’d argue the code examples represent the discussed concepts better, though to be honest the form of presentation hinders the clarity of the explanation. Yet, I had too much fun messing around with this post so here it is in this form.<p>Lastly, I don’t know Old English so the dialogue is probably incorrect. I’m happy to accept corrections but otherwise I don’t care that much. One shouldn’t take this post too seriously.Monospace considered harmfulhttp://mina86.com/2023/03/19/monospace-considered-harmful2023-03-19T16:34:53Z2023-03-19T16:34:53ZMichał ‘mina86’ Nazarewiczhttps://mina86.com<p>No, I haven’t gone completely mad yet and still, I write this as an appeal to stop using monospaced fonts for code <small>(conditions may apply)</small>. While fixed-width fonts have undeniable benefits when authoring software, their use is excessive and even detrimental in certain contexts. 
Specifically, when displaying inline code within a paragraph of text, proportional fonts are a better choice.<h2>The downsides</h2><figure class=fr><svg xmlns=http://www.w3.org/2000/svg version=1.1 width=17.5em height=10em viewbox="0 0 280 160" stroke-width=1><g fill=var(--i) stroke=var(--e)><circle cx=16 cy=12 r=4></circle><circle cx=32 cy=30 r=4></circle><circle cx=80 cy=48 r=4></circle><circle cx=92 cy=66 r=4></circle><circle cx=96 cy=84 r=4></circle><circle cx=132 cy=102 r=4></circle><circle cx=160 cy=120 r=4></circle></g><path stroke=var(--e) d=M20,0v135M140,0v135M260,0v135 /><text font-size=0.75em text-anchor=middle fill=var(--e)><tspan x=20 y=144>4′30″</tspan><tspan x=140 y=144>5′</tspan><tspan x=260 y=144>5′30″</tspan></text><text font-size=0.75em><tspan x=26 y=16>Tahoma</tspan><tspan x=42 y=34>Times New Roman</tspan><tspan x=90 y=52>Verdana</tspan><tspan x=102 y=70>Arial</tspan><tspan x=106 y=88>Comic Sans</tspan><tspan x=142 y=106>Georgia</tspan><tspan x=170 y=124>Courier New</tspan></text></svg><figcaption>Fig. 1. Comparison of time needed to read text set with different fonts. Reading fixed-width Courier New is 13% slower than reading Tahoma.</figcaption></figure><p>Fixed-width fonts for inline code have a handful of downsides. Firstly, text set in such font takes up more space and, depending on the font pairing, individual letters may appear larger. This creates unbalanced look and opportunities for awkward line wrapping.<p>Moreover, a fixed-width typeface <a href=//blog.codinghorror.com/comparing-font-legibility/ >has been shown to be slower to read</a>. Even disregarding the speed differences, switching between two drastically different types of font isn’t comfortable.<p>To make matters worse, many websites apply too many styles to inline code fragments. For example GitHub and GitLab (i) change the font, (ii) decrease its size, (iii) add background and (iv) add padding. 
This overemphasis detracts from the content rather than enhancing it.<h2>A better way</h2><p id=b1>A better approach is using a serif (sans-serif) font for the main text and a sans-serif (serif) font for inline code<a href=#f1>†</a>. Or if serifs aren’t one’s cup of tea, even within the same font group a pairing allowing for clear differentiation between the main text and the code is possible. For example, a humanist font paired with a complementary geometric font.<p id=b2>Another option is to format code with a different colour. To avoid using <a href=//www.w3.org/WAI/WCAG21/Understanding/use-of-color.html>it as the only means of conveying information</a>, a subtle colour change may be used in conjunction with a font change. This is the approach I’ve taken on this blog<a href=#f2>‡</a>.<p>It’s also worth considering whether inline code even needs any kind of style change. For example, the sentence ‘Execution of a C program starts from the main function’ is perfectly understandable whether or not ‘main’ is styled differently.<h2>Epilogue</h2><p>What about code blocks? Using proportional typefaces for them can be done with some care. Indentation isn’t a major problem but some alignment may need adjustments. Depending on the type of code listings, it may be an option. Having said that, I don’t claim this as the only correct option for web publishing.<p>As an aside, what’s the deal with parentheses after a function name? To demonstrate, let’s reuse an earlier example: ‘Execution of a C program starts from the <code>main()</code> function’. The brackets aren’t part of the function name and unless they are used to disambiguate between multiple overloaded functions, there’s no need for them.<p>To conclude, while fixed-width fonts have their place when writing code, their use in displaying inline code is often unnecessary. Using a complementary pairing of proportional typefaces is a better option that can enhance readability. 
Changing the background of inline code is virtually never a good idea.<p id=f1><a href=#b1>†</a> Using serif faces on websites used to carry a risk of aliasing reducing legibility. Thankfully, the rise of high DPI displays largely alleviated those concerns.<p id=f2><a href=#b2>‡</a> Combining colour change and typeface change breaks the principle of using small style changes. Nonetheless, I believe some leniency for websites is in order. It’s not always guaranteed that readers will see the fonts the author has chosen, making the colour change a kind of backup. Furthermore, compared to books, change in colour isn’t as focus-grabbing on the Internet.Chronological order of The Witcherhttp://mina86.com/2022/12/11/witcher-chronological-order2022-12-11T00:54:00Z2022-12-11T00:54:00ZMichał ‘mina86’ Nazarewiczhttps://mina86.com<p>Ever since the Witcher games took off, the franchise has skyrocketed in popularity. Books, comics, TV show, games, more comics, another TV show… The story of Geralt and his merry company has been told in so many ways that it’s becoming a wee bit hard to keep track of the chronology of all the events, especially across different forms of media.<p>In this article I’ve collected all official Witcher works, ordering them chronologically. To avoid any confusion, let me state up front that if you’re new to the franchise or haven’t read the books yet this list might not be for you. If you’re looking for the order to read the books in, I’ve prepared <a href=/2022/witcher-reading-order/ >a separate article</a> which describes that.<p class=np><a href=/2022/witcher-chronological-order/#chrono>Skip right to the chronology</a><h2>Canons</h2><p>This compilation includes the following works:<ul><li><dfn>Books</dfn> refers to the main story as written by Andrzej Sapkowski in the short stories and novels.<li><dfn>CDPR</dfn> denotes the canon of <a href=//www.thewitcher.com/ >the video games released by CD Projekt Red</a>. It’s largely consistent with the books. 
This category includes games, <a href=//www.darkhorse.com/Search/Browse/%22witcher%22---Books---January+1986-December+2023/Ppydwkt8>comic books published by Dark Horse</a> and cinematics set in the games’ continuity.<li><dfn>Netflix</dfn> is the continuity of <a href=//www.witchernetflix.com/ >the Netflix TV series</a> and all related spin-offs.<li><dfn>P&P</dfn> indicates Polish <a href="//www.proszynski.pl/search_results.php?keywords=wied%C5%BAmin">comic books by Bogusław Połch and Maciej Parowski</a>. Except for a single issue (‘<i lang=pl>Zdrada</i>’), the comics are faithful adaptations of the respective short stories.<li><dfn>Hexer</dfn> refers to <a href=//archive.org/details/TheWitcherTV>the Polish TV series <i lang=pl>Wiedźmin</i> released in 2002</a>.</ul><h3>Netflix</h3><p>The first season of the Netflix show presents split timelines. Each episode can have up to three arcs, one for each of the main characters: Geralt, Ciri and Yennefer. Stories for each person are presented in chronological order throughout the season; however, events between the timelines don’t line up. For example, the second episode shows Yennefer in the year 1206, Geralt in 1240 and Ciri in 1263.<p>The episodes of the main series are given using S<var>nn</var>E<var>mm</var> notation (indicating episode <var>mm</var> of season <var>nn</var>) rather than using their titles. Furthermore, episodes of the first season have the name of the character in parentheses indicating whose arc the entry refers to. For example, the second episode has three separate entries: S01E02 (Yen), S01E02 (Geralt) and S01E02 (Ciri).<p>Dates of the events in the Netflix show are taken from <a href=//witchernetflix.com/en/map/ rel=nofollow>its official website</a>.<h2>Disclaimer</h2><p>It’s important to note that dates and chronology aren’t always consistent even within a single canon. 
A common example is <a href="//witcher.fandom.com/wiki/Ciri#Ciri's_age">Ciri’s date of birth</a> which can be interpreted differently based on different parts of the books.<p>Furthermore, because of episodic nature of some of the stories, it’s not always possible to order them in relation to other events. For example, <span title="pl. ‘Ziarno prawdy’">A Grain of Truth</span> could take place pretty much at any point in Geralt’s life.<p>In some cases of ambiguity regarding books and CDPR timelines, I’ve resorted to looking at <a href=//witcher.fandom.com/wiki/Timeline>Witcher fandom wiki</a>.<p>Lastly, dates between canons aren’t consistent. For example, Geralt and Yennefer meet in 1248 in the books but in 1256 in the Netflix show. The compilation orders entries by ‘events’ rather than by dates.<h2>The chronology</h2><style>@media screen{#chrono{width:99vw;margin-left:calc(min(50vw,25rem) - 50vw)}}th.rt{vertical-align:bottom}th.rt span{writing-mode:vertical-rl;transform:rotate(180deg)}th.rt,#chrono tbody td:nth-child(1),#chrono tbody td:nth-child(2),#chrono tbody td:nth-child(3),#chrono tbody td:nth-child(4),#chrono tbody td:nth-child(5){text-align:center;width:1.5em;padding-left:0;padding-right:0}#chrono tbody tr{border-top:1px solid var(--e);color:#000}#chrono tbody tr a,#chrono tbody tr a:visited #chrono tbody tr a:hover,#chrono tbody tr a:focus{color:inherit}.bk{background:#ccebc5}.cd{background:#fbb4ae}.nt{background:#decbe4}.pp{background:#b3cde3}.hx{background:#fed9a6}.ot{background:#ffffcc}.no{font-size:0.875em}.em{/* https://nolanlawson.com/2022/04/08/the-struggle-of-using-native-emoji-on-the-web/ */ font-family:'Twemoji Mozilla','Android Emoji','EmojiOne Color','Segoe UI Emoji','Segoe UI Symbol','Noto Color Emoji','Apple Color Emoji','Noto Color Emoji','Noto Emoji',sans-serif}ul.compact li{margin-top:0;margin-bottom:0}</style><p>To distinguish different forms of media (books, games, shows etc) the following icons are used next to the titles to identify 
what they refer to:<ul class=compact><li>📜 — short story / 📕 — novel / 🖌 — comic book<li>🕹 — video game / 📱 — mobile game / 🎲 — board game or TTRPG<li>📺 — television series / ア — animated film / 🎥 — cinematic trailer</ul><p class=np>Clicking on a canon name in the table’s header allows filtering rows by canon.<table id=chrono><thead><tr><th colspan=5>Canon<th rowspan=2>Title<th rowspan=2>Year<th rowspan=2>Notes<tr id=canons na><th class=rt><span>Books</span><th class=rt><span>CDPR</span><th class=rt><span>Netflix</span><th class=rt><span>P&P</span><th class=rt><span>Hexer</span><tbody><tr class=nt><td><td><td>✓<td><td><td>The Witcher: Blood Origin 📺<td>3<td><tr class=cd><td><td>✓<td><td><td><td><span title="pl. ‘GWINT: Mag Renegat’">GWENT: Rogue Mage</span> 🕹<td>950s<td class=no>Takes place ‘hundreds of years before […] witchers were roaming the Continent’. There’s also an accompanying (official?) <a href=//forums.cdprojektred.com/index.php?threads/alzurs-story.11083114/ >Alzur’s Story</a>.<tr class=cd><td><td>✓<td><td><td><td colspan=3>The Witcher: A Witcher’s Journal 🎲<tr class=cd><td><td>✓<td><td><td><td colspan=3><span title="pl. ‘Wiedźmin: Stary Świat’">Witcher: Old World</span> 🎲<tr class=nt><td><td><td>✓<td><td><td>Nightmare of the Wolf ア<td>1107–1109<td class=no>The beginning of the film takes place decades before Geralt’s birth.<tr class=ot><td>?<td><td><td><td><td colspan=2><span title="pl. ‘Droga, z której się nie wraca’">A Road with No Return</span> 📜<td class=no>It’s arguably non-canon, with the main connection being one of the main characters having the same name as Geralt’s mother.<tr class=pp><td><td><td><td>✓<td><td colspan=2><i lang=pl>Droga bez powrotu</i> 🖌<td class=no>Adaptation of ‘<span title="pl. 
‘Droga, z której się nie wraca’">A Road with No Return</span>’ with a slightly different title.<tr class=nt><td><td><td>✓<td><td><td>Nightmare of the Wolf ア<td>1165<td class=no>By the end of the film Geralt is five years old.<tr class=hx><td><td><td><td><td>✓<td colspan=2>E01: <i lang=pl>Dzieciństwo</i> 📺<td class=no>Depicts Geralt at seven years old.<tr class=pp><td><td><td><td>✓<td><td colspan=2><i lang=pl>Zdrada</i> 🖌<td class=no>The story happens with Geralt still training at Kaer Morhen.<tr class=hx><td><td><td><td><td>✓<td colspan=2>E02: <i>Nauka</i> 📺<td class=no>Depicts Geralt’s graduation from Kaer Morhen.<tr class=hx><td><td><td><td><td>✓<td colspan=2>E03: <i>Człowiek – pierwsze spotkanie</i> 📺<td class=no>Takes place immediately after Geralt’s graduation. Contains minor elements of ‘<span title="pl. ‘Mniejsze zło’">The Lesser Evil</span>’ and ‘<span title="pl. ‘Głos rozsądku’">The Voice of Reason</span>’.<tr class=cd><td><td>✓<td><td><td><td colspan=2><span title="pl. ‘Wiedźmin: Krwawy szlak’">The Witcher: Crimson Trail</span> 📱<td class=no>The game takes place soon after Geralt finishes his training.<tr class=nt><td><td><td>✓<td><td><td>S01E02 (Yen) 📺<td>1206<td><tr class=nt><td><td><td>✓<td><td><td>S01E03 (Yen) 📺<td>1210<td><tr class=bk><td>✓<td>✓<td><td><td><td colspan=3><span title="pl. ‘Ziarno prawdy’">A Grain of Truth</span> 📜🖌<tr class=bk><td>✓<td><td><td>✓<td><td colspan=3><span title="pl. ‘Mniejsze zło’">The Lesser Evil</span> 📜🖌<tr class=cd><td><td>✓<td><td><td><td colspan=2><span title="pl. ‘Dom ze szkła’">House of Glass</span> 🖌<td class=no>Geralt speaks of a lack of emotions, friends or love, which makes me think this happens before <span title="pl. ‘Kraniec Świata’">The Edge of the World</span>.<tr class=nt><td><td><td>✓<td><td><td>S01E01 (Geralt) 📺<td>1231<td class=no>Based on ‘<span title="pl. 
‘Mniejsze zło’">The Lesser Evil</span>’.<tr class=cd><td><td>?<td><td><td><td>The Price of Neutrality 🕹<td>1232<td class=no>Part of ‘The Witcher: Enhanced Edition’. Based on ‘<span title="pl. ‘Mniejsze zło’">The Lesser Evil</span>’. The story is told by Dandelion, who may be considered an unreliable narrator.<tr class=bk><td>✓<td><td><td><td><td><span title="pl. ‘Kraniec Świata’">The Edge of the World</span> 📜<td>1248<td class=no>Geralt and Jaskier meet for the first time.<tr class=nt><td><td><td>✓<td><td><td>S01E02 (Geralt) 📺<td>1240<td class=no>Based on ‘<span title="pl. ‘Kraniec Świata’">The Edge of the World</span>’.<tr class=nt><td><td><td>✓<td><td><td>S01E04 (Yen) 📺<td>1240<td><tr class=nt><td><td><td>✓<td><td><td>S01E03 (Geralt) 📺<td>1243<td class=no>Based on ‘<span title="pl. ‘Wiedźmin’">The Witcher</span>’.<tr class=nt><td><td><td>✓<td><td><td>S01E04 (Geralt) 📺<td>1249<td class=no>Based on ‘<span title="pl. ‘Kwestia ceny’">A Question of Price</span>’.<tr class=bk><td>✓<td><td><td>✓<td><td><span title="pl. ‘Ostatnie życzenie’">The Last Wish</span> 📜🖌<td>1248<td class=no>Geralt and Yennefer meet for the first time.<tr class=nt><td><td><td>✓<td><td><td>S01E05 (Geralt & Yen) 📺<td>1256<td class=no>Based on ‘<span title="pl. ‘Ostatnie życzenie’">The Last Wish</span>’.<tr class=bk><td>✓<td><td><td><td><td><span title="pl. ‘Sezon burz’">Season of Storms</span> 📕<td>ca. 1250<td class=no>1245 according to the date in the book but that’s inconsistent in relation to other books. Note that the interlude and epilogue occur after the events of the saga.<tr class=cd><td><td>✓<td><td><td><td colspan=2><span title="pl. ‘Dzieci lisicy’">Fox Children</span> 🖌<td class=no>Based on chapters 14–15 of <span title="pl. ‘Sezon burz’">Season of Storms</span>.<tr class=bk><td>✓<td><td><td><td><td><span title="pl. ‘Kwestia ceny’">A Question of Price</span> 📜<td>1251<td><tr class=bk><td>✓<td>✓<td><td><td><td><span title="pl. 
‘Wiedźmin’">The Witcher</span> 📜🎥<td>1252<td><tr class=pp><td><td><td><td>✓<td><td><i lang=pl>Geralt</i> 🖌<td>1252<td class=no>Adaptation of ‘<span title="pl. ‘Wiedźmin’">The Witcher</span>’ under a different title.<tr class=bk><td>✓<td><td><td><td><td><span title="pl. ‘Głos rozsądku’">The Voice of Reason</span> 📜<td>1252<td><tr class=cd><td><td>?<td><td><td><td>Side Effects 🕹<td>1253<td class=no>Part of ‘The Witcher: Enhanced Edition’ set one year after ‘<span title="pl. ‘Wiedźmin’">The Witcher</span>’ short story. The story is told by Dandelion, who may be considered an unreliable narrator.<tr class=bk><td>✓<td><td><td>✓<td><td><span title="pl. ‘Granica możliwości’">The Bounds of Reason</span> 📜🖌<td>1253<td><tr class=nt><td><td><td>✓<td><td><td>S01E06 (Geralt & Yen) 📺<td>1262<td class=no>Based on ‘<span title="pl. ‘Granica możliwości’">The Bounds of Reason</span>’.<tr class=hx><td><td><td><td><td>✓<td colspan=2>E04: <i>Smok</i> 📺<td class=no>Based on ‘<span title="pl. ‘Granica możliwości’">The Bounds of Reason</span>’.<tr class=bk><td>✓<td><td><td><td><td colspan=2><span title="pl. ‘Okruch lodu’">A Shard of Ice</span> 📜<td><tr class=hx><td><td><td><td><td>✓<td colspan=2>E05: <i>Okruch lodu</i> 📺<td class=no>Based on ‘<span title="pl. ‘Okruch lodu’">A Shard of Ice</span>’.<tr class=hx><td><td><td><td><td>✓<td colspan=2>E06: <i>Calanthe</i> 📺<td class=no>Based on ‘<span title="pl. ‘Kwestia ceny’">A Question of Price</span>’.<tr class=bk><td>✓<td><td><td><td><td colspan=2><span title="pl. ‘Wieczny ogień’">The Eternal Fire</span> 📜<td><tr class=hx><td><td><td><td><td>✓<td colspan=2>E07: <i>Dolina Kwiatów</i> 📺<td class=no>Based on ‘<span title="pl. ‘Wieczny ogień’">The Eternal Fire</span>’ and ‘<span title="pl. ‘Kraniec Świata’">The Edge of the World</span>’.<tr class=bk><td>✓<td><td><td><td><td colspan=2><span title="pl. ‘Trochę poświęcenia’">A Little Sacrifice</span> 📜<td><tr class=bk><td>✓<td><td><td><td><td><span title="pl. 
‘Miecz przeznaczenia’">The Sword of Destiny</span> 📜<td>1262<td><tr class=cd><td><td>✓<td><td><td><td><span title="pl. ‘Racja stanu’">Reasons of State</span> 🖌<td>1262<td class=no>Story happens just after Geralt saves Ciri from the Brokilon dryads.<tr class=bk><td>✓<td><td><td><td><td><span title="pl. ‘Coś więcej’">Something More</span> 📜<td>1264<td><tr class=nt><td><td><td>✓<td><td><td>S01E07 (Geralt) 📺<td>1263<td><tr class=nt><td><td><td>✓<td><td><td>S01E07 (Yen) 📺<td>1263<td><tr class=nt><td><td><td>✓<td><td><td>S01E01-07 (Ciri) 📺<td>1263<td class=no>Episode 4 based on ‘<span title="pl. ‘Miecz przeznaczenia’">The Sword of Destiny</span>’. Episode 7 based on ‘<span title="pl. ‘Coś więcej’">Something More</span>’.<tr class=nt><td><td><td>✓<td><td><td>S01E08 📺<td>1263<td class=no>Based on ‘<span title="pl. ‘Coś więcej’">Something More</span>’. In this episode all timelines converge.<tr class=hx><td><td><td><td><td>✓<td colspan=2>E08: <i>Rozdroże</i> 📺<td class=no>Based on ‘<span title="pl. ‘Wiedźmin’">The Witcher</span>’ with elements from ‘<span title="pl. ‘Głos rozsądku’">The Voice of Reason</span>’ and ‘<span title="pl. ‘Coś więcej’">Something More</span>’.<tr class=hx><td><td><td><td><td>✓<td colspan=2>E09: <i>Świątynia Melitele</i> 📺<td class=no>Contains elements of ‘<span title="pl. ‘Głos rozsądku’">The Voice of Reason</span>’.<tr class=hx><td><td><td><td><td>✓<td colspan=2>E10: <i>Mniejsze zło</i> 📺<td class=no>Based on ‘<span title="pl. ‘Mniejsze zło’">The Lesser Evil</span>’.<tr class=hx><td><td><td><td><td>✓<td colspan=2>E11: <i>Jaskier</i> 📺<td class=no>Contains elements of ‘<span title="pl. ‘Mniejsze zło’">The Lesser Evil</span>’.<tr class=hx><td><td><td><td><td>✓<td colspan=3>E12: <i>Falwick</i> 📺<tr class=hx><td><td><td><td><td>✓<td colspan=2>E13: <i>Ciri</i> 📺<td class=no>Contains elements of ‘<span title="pl. ‘Coś więcej’">Something More</span>’.<tr class=nt><td><td><td>✓<td><td><td>S02E01 📺<td>1263–1264<td class=no>Based on ‘<span title="pl. 
‘Ziarno prawdy’">A Grain of Truth</span>’.<tr class=bk><td>✓<td><td><td><td><td><span title="pl. ‘Krew elfów’">Blood of Elves</span> 📕<td>1265–1267<td><tr class=nt><td><td><td>✓<td><td><td>S02E02-08 📺<td>1263–1264<td class=no>Based on ‘<span title="pl. ‘Krew elfów’">Blood of Elves</span>’ and parts of ‘<span title="pl. ‘Czas pogardy’">Time of Contempt</span>’.<tr class=nt><td><td><td>✓<td><td><td>S03 📺<td>1264<td class=no>Based on ‘<span title="pl. ‘Krew elfów’">Blood of Elves</span>’ and ‘<span title="pl. ‘Czas pogardy’">Time of Contempt</span>’.<tr class=bk><td>✓<td><td><td><td><td><span title="pl. ‘Czas pogardy’">Time of Contempt</span> 📕<td>1267<td><tr class=bk><td>✓<td><td><td><td><td><span title="pl. ‘Chrzest ognia’">Baptism of Fire</span> 📕<td>1267<td><tr class=bk><td>✓<td><td><td><td><td><span title="pl. ‘Wieża jaskółki’">The Tower of the Swallow</span> 📕<td>1267–1268<td><tr class=bk><td>✓<td><td><td><td><td><span title="pl. ‘Pani jeziora’">Lady of the Lake</span> 📕<td>1268<td><tr class=cd><td><td>✓<td><td><td><td><span title="pl. ‘Wojna krwi: Wiedźmińskie opowieści’">Thronebreaker: Witcher Stories</span> 🕹<td>1267–1268<td class=no>The game takes place concurrently to the saga. Chapters 1–2 happen concurrently to chapters 4–5 of ‘<span title="pl. ‘Czas pogardy’">Time of Contempt</span>’. Chapter 3 happens around chapter 3 and chapter 4 happens concurrently with chapter 7 of ‘<span title="pl. ‘Chrzest ognia’">Baptism of Fire</span>’. Chapter 5 happens concurrently with chapters 6–8 and epilogue happens before chapter 9 of ‘<span title="pl. ‘Pani jeziora’">Lady of the Lake</span>’.<tr class=ot><td>✗<td><td><td><td><td colspan=2><span title="pl. ‘Coś się kończy, coś się zaczyna’">Something Ends, Something Begins</span> 📜<td class=no>Non-canon even though written by Sapkowski.<tr class=cd><td><td>✓<td><td><td><td><span title="pl. 
‘Wiedźmin’">The Witcher</span> 🕹<td>1270<td class=no>Dates as given in the games even though it is supposed to take place five years after the saga. To be aligned with the books, add three years.<tr class=cd><td><td>✓<td><td><td><td><span title="pl. ‘Wiedźmin 2: Zabójcy królów’">The Witcher 2: Assassins of Kings</span> 🕹<td>1271<td><tr class=cd><td><td>✓<td><td><td><td colspan=3><span title="pl. ‘Wiedźmin gra przygodowa’">The Witcher Adventure Game</span> 🕹🎲<tr class=cd><td><td>✓<td><td><td><td colspan=3>The Witcher Tabletop RPG 🎲<tr class=cd><td><td>✓<td><td><td><td><span title="pl. ‘Rachunek sumienia’">Matters of Conscience</span> 🖌<td>1271<td class=no>Happens after Geralt deals with Letho. Assumes Iorveth path and dragon being spared.<tr class=cd><td><td>✓<td><td><td><td colspan=2><span title="pl. ‘Jak zabijać potwory’">Killing Monsters</span> 🖌<td class=no rowspan=2>Happens as Geralt travels with Vesemir just before Witcher 3.<tr class=cd><td><td>✓<td><td><td><td colspan=2><a href="//www.youtube.com/watch?v=c0i88t0Kacs"><span title="pl. ‘Jak zabijać potwory’">Killing Monsters</span></a> 🎥<tr class=cd><td><td>✓<td><td><td><td><span title="pl. ‘Wiedźmin 3: Dziki Gon’">The Witcher 3: Wild Hunt</span> 🕹<td>1272<td><tr class=cd><td><td>✓<td><td><td><td colspan=2><span title="pl. ‘Serca z kamienia’">Hearts of Stone</span> 🕹<td><tr class=cd><td><td>✓<td><td><td><td colspan=2><span title="pl. ‘Zatarte wspomnienia’">Fading Memories</span> 🖌<td class=no>This could be placed at any point in the timeline.<tr class=cd><td><td>✓<td><td><td><td colspan=2><span title="pl. ‘Klątwa kruków’">Curse of Crows</span> 🖌<td class=no>Assumes Ciri becoming a witcher and romance with Yennefer.<tr class=cd><td><td>✓<td><td><td><td colspan=2><span title="pl. ‘Córka płomienia’">Of Flesh and Flame</span> 🖌<td class=no>Directly references events of <span title="pl. ‘Serca z kamienia’">Hearts of Stone</span>.<tr class=cd><td><td>✓<td><td><td><td><span title="pl. 
‘Krew i wino’">Blood and Wine</span> 🕹<td>1275<td><tr class=cd><td><td>✓<td><td><td><td colspan=2><a href="//www.youtube.com/watch?v=1-l29HlKkXU">A Night to Remember</a> 🎥<td class=no>Shows the same character as the one in the ‘<span title="pl. ‘Krew i wino’">Blood and Wine</span>’ expansion.<sup><a href=//twitter.com/MTomaszkiewicz/status/785212276500467714>citation</a></sup><tr class=cd><td><td>✓<td><td><td><td colspan=2><span title="pl. ‘Wiedźmi lament’">Witch’s Lament</span> 🖌<td class=no><tr class=ot><td><td>✗<td><td><td><td colspan=2>The Witcher: Ronin 🖌<td class=no>Alternate universe inspired by Edo-period Japan.</table><script src=/D/pgB6arVO.js></script><h2>Acknowledgments</h2><p>Thanks to acbagel for providing further information <a href=//www.reddit.com/r/witcher/comments/zi9cax/chronology_of_the_witcher_across_all_available/ >on Reddit</a> and in comments below.URLs with <code>//</code> at the beginninghttp://mina86.com/2022/12/04/urls-support-slash-slash2022-12-04T02:09:04Z2022-12-04T02:09:04ZMichał ‘mina86’ Nazarewiczhttps://mina86.com<p>A quick reminder that relative URLs can start with a double slash and that this means something different than a single slash at the beginning. Specifically, such relative addresses are resolved by taking the scheme (and only the scheme) of the website they are on.<p>For example, the code for the link to my repositories in the site’s header is <code><a href="//github.com/mina86">Code</a></code>. Since this page uses the <code>https</code> scheme, browsers will navigate to <code>https://github.com/mina86</code> if the link is activated.<p>This little trick can save you some typing, but more importantly, if you’re developing URL parsing code or a crawler, make sure that it handles this case correctly. 
It may seem like a small detail, but it can have a lasting impact on the functionality of your code.Secret command Google doesn’t want you to knowhttp://mina86.com/2022/11/20/how-to-change-google-language2022-11-20T17:55:45Z2022-11-20T17:55:45ZMichał ‘mina86’ Nazarewiczhttps://mina86.com<p style=text-align:center>Or how to change the language of Google websites.<p>If you’ve travelled abroad, you may have noticed that Google <em>tries</em> to be helpful and uses the language of the region you’re in on its websites. It doesn’t matter if your operating system is set to Spanish, for example; Google Search will still use Portuguese if you happen to be in Brazil.<p>Fortunately, there’s a simple way to force Google to use a specific language. All you need to do is append <code>?hl=<var>lang</var></code> to the website’s address, replacing <var>lang</var> with <a href=//en.wikipedia.org/wiki/List_of_ISO_639-1_codes>a two-letter code</a> for the desired language. For instance, <a href="//www.google.com/?hl=es"><code>?hl=es</code></a> for Spanish, <a href="//www.google.com/?hl=ht"><code>?hl=ht</code></a> for Haitian, or <a href="//www.google.com/?hl=uk"><code>?hl=uk</code></a> for Ukrainian.<p>If the URL already contains a question mark, you need to append <code>&hl=<var>lang</var></code> instead. Additionally, if it contains a hash symbol, you need to insert the string immediately before the hash symbol. For example:<ul><li><code>https://www.google.com/<strong>?hl=es</strong></code><li><code>https://www.google.com/search?q=bread+sandwich<strong>&hl=es</strong></code><li><code>https://analytics.google.com/analytics/web/<strong>?hl=es</strong>#/report-home/</code></ul><p>By the way, as a legacy of Facebook having hired many ex-Google employees, the parameter also works on some of the Facebook properties.<p>This trick doesn’t work on all Google properties. 
However, it seems to be effective on pages that try to guess your language preference without giving you the option to override it. For example, while Gmail ignores the parameter, you can change its display language in the settings (accessible via the gear icon near the top right of the page). Similarly, YouTube strips the parameter, but it respects preferences configured in the web browser.<p>Anyone familiar with HTTP may wonder why Google doesn’t simply look at <a href=//developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language>the <code>Accept-Language</code> header</a>. Problem is that many users have their browser configured with defaults that send English as the only option, even though they would prefer another language. In those cases, it’s more user-friendly to ignore that header. As it turns out, localisation is hard.Curious case of missing πhttp://mina86.com/2022/06/28/curious-case-of-missing-pi2022-06-28T03:18:53Z2022-06-28T03:18:53ZMichał ‘mina86’ Nazarewiczhttps://mina86.com<p>π is one of those constants which pops up when least expected. At the same time it’s sometimes missing when most needed. For example, consider the following application calculating area of a disk (not to be confused with area of a circle which is zero):<pre>#include <math.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
	for (int i = 1; i < argc; ++i) {
		const double r = atof(argv[i]);
		printf("%f\n", M_PI * r * r);
	}
}</pre><p>It uses features introduced in the 1999 edition of the C standard (often referred to as C99) so it might be good to inform the compiler of that fact with a <code>-std=c99</code> flag. Unfortunately, doing so leads to an error:<pre>$ gcc -std=c99 -o area area.c
area.c: In function ‘main’:
area.c:8:18: error: ‘M_PI’ undeclared (first use in this function)
8 | printf("%f\n", M_PI * r * r);
| ^~~~</pre><p>What’s going on? Shouldn’t <code>math.h</code> provide the definition of the <code>M_PI</code> symbol? It’s what <a href=//pubs.opengroup.org/onlinepubs/9699919799/basedefs/math.h.html>the specification</a> claims after all. ‘glibc is broken’ some may even proclaim. In this article I’ll explain why the compiler conspires with the standard library to behave this way and why it’s the only valid thing it can do.<h2>The problem</h2><p id=b1>First of all, it needs to be observed that the aforecited specification is <em>not</em> the C standard. Instead, it’s POSIX and it marks <code>M_PI</code> with an [XSI] tag. This means that <code>M_PI</code> ‘is part of the X/Open Systems Interfaces option’ and ‘is an extension to the ISO C standard.’ Indeed, <a href=http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf>The C99 standard</a><a href=#f1>†</a> doesn’t define this constant.<p>Trying to support multiple standards, gcc and glibc behave differently depending on arguments. With the <code>-std=c99</code> switch the compiler conforms to the C standard; without the switch, it includes all the POSIX extensions.<p>A naïve approach would be to make life easier and unconditionally provide the constant. Alas, the <code>M_PI</code> identifier is neither defined nor reserved by the C standard. In other words, a programmer can freely use it and a conforming compiler cannot complain. For example, the following is a well-formed C program and a C compiler has no choice but to accept it:<pre>#include <math.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
	const double M_PI = 22.0 / 7.0;
	for (int i = 1; i < argc; ++i) {
		const double r = atof(argv[i]);
		printf("%f\n", M_PI * r * r);
	}
}</pre><p>Should the compiler always define <code>M_PI</code> in <code>math.h</code>, the above code wouldn’t work.<h2>The solution</h2><p>A developer who needs the π constant has a few ways to solve the problem. For maximum portability it has to be defined in the program itself. To make things work even when building on a Unix-like system, an <code>ifdef</code> guard can be used. For example:<pre>⋮
#ifndef M_PI
# define M_PI 3.141592653589793238462643383279502884
#endif
⋮</pre><p>Another solution is to limit compatibility to Unix-like systems. In this case, the <code>M_PI</code> constant can be freely used and the <code>-std</code> switch shouldn’t be passed when building the program.<h3>Feature test macros</h3><p>glibc provides one more approach. The C standard has a notion of reserved identifiers which cannot be freely used by programmers. They are used for future language development and to allow implementations to provide their own extensions to the language.<p>For example, when C99 added a boolean type to the language, it did it by defining a <code>_Bool</code> type. The <code>bool</code>, <code>true</code> and <code>false</code> symbols became available only through the <code>stdbool.h</code> file. Such an approach means that C89 code continues to work when built with a C99 compiler even if it used the <code>bool</code>, <code>true</code> or <code>false</code> identifiers in its own way.<p>Similarly, glibc introduced <a href=//lwn.net/Articles/590381>feature test macros</a>. They all start with an underscore followed by a capital letter. Identifiers in this form are reserved by the C standard, thus using them in a program invokes undefined behaviour. Technically speaking, the following is not a well-formed C program:<pre>#define _XOPEN_SOURCE
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
	for (int i = 1; i < argc; ++i) {
		const double r = atof(argv[i]);
		printf("%f\n", M_PI * r * r);
	}
}</pre><p>However, glibc <a href=//gnu.org/software/libc/manual/html_node/Feature-Test-Macros.html>documents the behaviour</a> making the program well-defined.<p>It’s worth noting that uClibc and musl libraries handle the cases in much the same way. Microsoft uses the same technique though different macros. To get access to <code>M_PI</code> in particular, a <code>_USE_MATH_DEFINES</code> symbol needs to be defined. Newlib will define symbols conflicting with C standard unless code is compiled in strict mode (e.g. with <code>-std=c99</code> flag). Lastly, Bionic and Diet Libc define the constant unconditionally which strictly speaking means that they don’t conform to the C standard.<p id=f1><a href=#b1>†</a> Yes, I’m aware this link is to a draft. The actual standard is <a href=//webstore.ansi.org/Standards/INCITS/INCITSISOIEC98991999R2005>60 USD from ANSI webstore</a>. Meanwhile, for most practical uses the draft is entirely sufficient. It’s certainly enough for the discussion in this article.Primes ≤ 100 in Rusthttp://mina86.com/2021/06/20/prime-numbers-less-than-a-hundred-in-rust2021-06-20T00:49:43Z2021-06-20T00:49:43ZMichał ‘mina86’ Nazarewiczhttps://mina86.com<p>In a past life <a href=/2010/prime-numbers-less-than-a-hundred/ >I’ve talked about</a> a challenge to write the shortest program which prints all prime numbers less than a hundred. Back then I’ve discussed a 60-character long solution written in C. Since Rust is the future, inspired by <a href=//www.reddit.com/r/rust/comments/o1i3d2/prime_number_generator_in_rust_664579_primes/ >a recent thread on Sieve of Eratosthenes</a> I’ve decided to carry the task for Rust as well.<p>To avoid spoiling the solution, I’m padding this article with a bit of unrelated content. To jump straight to the code, skip the next block of paragraphs. Otherwise, here’s a joke for ya:<blockquote><p>After realising he got lost, a man in a hot air balloon spotted a woman below. He descended and shouted, ‘Excuse me, can you help me? 
I’ve promised a friend I would meet him an hour ago, but I don’t know where I am.’<p>The woman below looked up and replied matter-of-factly, ‘You are in a hot air balloon hovering around ten metres above the ground. You are between 47 and 48 degrees north latitude and between 8 and 9 degrees east longitude.’<p>‘You must be an engineer,’ the balloonist concluded.<p>‘I am,’ the woman replied intrigued, ‘How did you know?’<p>‘Well, everything you told me is technically correct, but I have no idea how to use your information and I am still lost. Frankly, you’ve not been much help.’<p>The woman pondered for a while and responded, ‘You must be in management.’<p>‘I am,’ the man confirmed, ‘but how did you know?’<p>‘Well, you don’t know where you are or where you are going, you have risen to where you are thanks to hot air, you made a promise which you have no idea how to keep and you expect people beneath you to solve your problems. The fact is you are in exactly the same position you were in before we met, but now, somehow, it’s my fault!’</blockquote><h2>The solution</h2><p>Now back to the matter at hand. Let’s first go with a 67-character long solution I came up with. It is as follows:<pre>fn main(){for n in 2..99{if(2..n).all(|k|n%k!=0){println!("{n}")}}}</pre><p>For comparison, here’s the aforementioned C variant:<pre>main(i,j){for(;j=++i<99;j<i||printf("%d\n",i))for(;i%++j;);}</pre><p>Let’s break it down a little taking this opportunity to talk about Rust.<p>Commonality between the two variants is lack of type declarations. It’s important to note that, while in C this was due to (since deprecated) rule that variables are implicitly integers, Rust performs type inference. In many situations in Rust there’s no need to declare types and the compiler will figure out the correct ones.<p>Rust doesn’t have a C-style <code>for</code> syntax and offers range loop instead. 
<code>for n in 2..99 { <var>body</var> }</code> will execute body with <code>n</code> variable ranging from 2 to 98 inclusively. Since 99 is not a prime, we don’t need to include it in the range. By the way, <code>2..99</code> is not part of the syntax for the loop; rather, it declares a range object. And yes, ranges are right-open (though there’s also syntax for closed intervals).<p><code>|<var>args</var>| <var>expr</var></code> is Rust’s syntax for lambdas (also known as anonymous functions). I’m not a fan of the pipe characters in there — I’d much rather have Haskell’s syntax instead — but it’s something one can get used to.<p>The <code>n % k != 0</code> expression demonstrates that Rust doesn’t <a href=/2021/explicit-isnt-better-than-implicit/ >implicitly</a> convert integers to booleans. In fact, the exclamation mark unary operator performs bitwise (not logical) negation when applied to integer types. That’s something tilde does in C. Tilde used to declare boxed types and values in Rust but is now unused.<p>Perhaps due to a quirk of history, ranges in Rust are iterators (as opposed to merely implementing <a href=//doc.rust-lang.org/std/iter/trait.IntoIterator.html><code>IntoIterator</code> trait</a>) which means that methods such as <code>all</code> are available on them. <code>all</code>, of course, checks whether all elements satisfy the predicate given as an argument. This means that <code>(2..n).all(<var>predicate</var>)</code> will test the <var>predicate</var> for all integers from 2 to n-1 inclusively (again, ranges are right-open unless a different operator is used).<p>And finally, <a href=//doc.rust-lang.org/std/macro.println.html><code>println!</code></a> is rather self-explanatory. Since <a href=//rust-lang.github.io/rfcs/2795-format-args-implicit-identifiers.html><code>format_args_implicits</code> feature</a> is now stable, the <code>"{n}"</code> syntax can be used to save one character. 
This is something Python programmers should be familiar with though in Rust the <code>f</code> sigil is not necessary. Programmer needs to know from context when <code>"{n}"</code> means string interpolation and when it’s a plain string literal.<h2>66-character solution?</h2><p>There is a way to reach 66 characters. It’s much more boring and I’m not sure if it’s in the spirit of the challenge. The trick is to hard-code the list of primes as a byte buffer. It’s not pretty, but it works:<pre>fn main(){for n in b"\r%)+/5;=CGIOSYa"{println!("{n}")}}</pre><p>Note that your user agent may fail to display control characters present in the above listing. Copying and pasting should work though.<p>The <code>b</code> sigil in front of the string is necessary to declare a byte-array rather than a <code>str</code> object. The latter cannot be iterated over without invoking <code>bytes</code> or <code>chars</code> method which would make the solution too long.How do balanced audio cables workhttp://mina86.com/2021/06/13/balanced-audio2021-06-13T12:00:11Z2021-06-13T12:00:11ZMichał ‘mina86’ Nazarewiczhttps://mina86.com<p>Have you ever wondered how balanced audio cables work? For the longest time I have until finally deciding to look into it. Turns out the principle is actually rather straightforward.<p>In a normal, unbalanced wire an analogue signal <var>S</var> is sent over a pair of wires: one carries the signal while the other a reference zero. Receiver interprets voltage between the two as the signal. The issue is that over the length of a cable noise is introduced. 
While transmitter sends <var>S</var>, receiver gets <var>S + e</var> (where <var>e</var> denotes the noise).<figure><svg width=45em height=15em viewbox="-10 -10 480 160" fill=none><defs><path id=sin stroke=var(--i) stroke-width=2 d="m0.375,7.3575822c0,0 1.45216,-6.99595999 2.82746,-6.98255999 1.3753,0.0134 2.83797,6.96253999 2.83797,6.96253999 0,0 1.5241,6.9844798 2.86424,6.9891998 1.34014,0.005 2.87126,-6.9891998 2.87126,-6.9891998"/><use href=#sin id=nis transform="scale(1, -1)"/><path id=noise stroke-width=0.4 d="M0.0202051,6.3820138 0.2720171,2.2287188 0.5242361,12.947371 0.7760481,8.6079708e-4 1.0282661,5.9265648l0.251814,0.742079 0.251812,3.9994562 0.252219,-6.0981682 0.251813,6.2265022 0.251812,-6.4269002 0.252219,6.1945212 0.251812,-9.3263952 0.252218,1.87281 0.251814,-1.806416 0.251812,1.590229 0.252219,1.2704 0.251812,-1.312909 0.252219,9.7409552 0.251813,-1.969162 0.251812,-7.5426522 0.252219,7.1677662 0.251813,-0.211733 0.252218,-5.9208462 0.251813,5.85769 0.251812,-4.293778 0.252219,8.2964722 0.251813,-13.1857742 0.251812,3.458179 0.252219,-3.638739 0.251813,7.513908 0.252217,-6.481151 0.251814,5.832185 0.251812,4.7893062 0.252219,-6.8127172 0.251813,1.592656 0.252217,-6.27063 0.251814,9.7490522 0.251813,-4.9791782 0.252217,-0.442495 0.251814,-1.555816 0.2518129,7.3677582 0.252217,-4.1140262 0.251814,2.536349 0.252217,-6.510703 0.251813,2.758608"/></defs><path stroke=var(--e) stroke-width=1 d=M0,126V0H126V126ZM255,126V0h179V126Z stroke-dasharray="1 1"/><text font-style=italic x=6 y=6>Transmitter<tspan x=428 text-anchor=end>Receiver</tspan><tspan font-style=normal font-size=20 x=170 y=57 dy="0 2.5 -2.75 0 2.25" dx="0 1 -0.5 2 -0.75" rotate="5 -7 -13 5 -5">noise</tspan></text><g stroke=currentcolor><path d="m138.75,138h7.5M135,135h15m-18.75,-3h22.5M142.5,104.25V132M392.25001,52.5h12.75v13.500001m-21,-5.250001v11.250069c-126.21168,1.175002-209.78467,0.24377-335.999734,0V33c126.216284,2.5e-5 209.783454,6.9e-5 
335.999734,6.9e-5V44.25M22.499997,58.5V52.500069H48.000276M32.999997,104.25H405.00001V90.000001M375.75001,52.5l8.25,-8.25 8.25,8.25-8.25,8.25z"/><g fill=var(--a)><path d=m95.249996,72.000069-26.999998,-15v30zm0,-39-26.999998,-15v30zm229.823004,39-27,-15v30zm0,-39-27,-15v30z /><circle cx=98 cy=72 r=6 /><circle cx=328 cy=72 r=6 /></g><path d=M273.27,22.35h5M273.27,62.85h5M275.77,19.85v5M275.77,60.5v5M351,22.35h5M353.5,19.85v5M351,62.85h5M384,49v7M380.5,52.5h7 /></g><path stroke=var(--f) stroke-width=4 stroke-linecap=round d="m22.499991,63.208315c3.5162,0 6.34694,2.83073 6.34694,6.34693v16.57465c0,3.5162-2.83074,6.34694-6.34694,6.34694-3.5162,0-6.34693,-2.83074-6.34693,-6.34694v-16.57465c0,-3.5162 2.83073,-6.34693 6.34693,-6.34693zm406.521429,-5.342357-14.33261,12.41469h-13.56474v15.97585h13.35921l14.53814,12.59375zM22.499991,97.317001V104.25m-5.40959,0 10.81917,-6e-5m6.367676,-18.697485c2e-6,6.492508-5.284824,11.757544-11.777325,11.764546-6.4925,0.007-11.763346,-5.246675-11.77735,-11.739166M437.93866,69.450978a16.294282,16.294282 0 0 1 0,17.88193m5.93278,-23.81471a25.068125,25.068125 0 0 1 0,29.7475m5.43143,-35.17893a32.421441,32.421441 0 0 1 0,40.61036"/><use href=#sin x=27 y=35 /> <use href=#sin x=108 y=15 /> <use href=#nis x=108 y=70 /> <use href=#sin x=259 y=15 /> <use href=#nis x=259 y=70 /> <use href=#sin x=337 y=15 /> <use href=#sin x=337 y=55 /> <use href=#sin x=337 y=55 /> <use href=#sin x=392 y=35 /><g stroke=var(--q) stroke-dasharray="1, 0.5, 0.75, 1.25"><path stroke-width=0.5 d="m140.40622,48.622739 1.16321,-26.54809 1.15941,68.51422 1.16319,-82.7548196 1.16321,37.8774396 1.15941,4.7434 1.16319,25.56474 1.16321,-38.97981 1.16318,39.80013 1.15941,-41.08109 1.16322,39.59571 1.16317,-59.61482 1.15942,11.97108 1.1632,-11.54669 1.1632,10.16483 1.1594,8.12046 1.16322,-8.39219 1.16318,62.26472 1.16321,-12.58698 1.15942,-48.21304 1.16317,45.81674 1.16322,-1.35342 1.15941,-37.84637 1.16318,37.44269 1.16321,-27.44607 1.15942,53.0315 1.16318,-84.28418 
1.16321,22.10488 1.16318,-23.25904 1.15942,48.02931 1.16321,-41.42787 1.16317,37.27965 1.15943,30.6135 1.1632,-43.54726 1.16318,10.18036 1.15942,-40.08223 1.16321,62.31649 1.16318,-31.82719 1.15941,-2.82845 1.16323,-9.94485 1.16316,47.09511 1.16322,-26.29709 1.15942,16.21249 1.16318,-41.61677 1.1632,17.63316 1.15942,-29.83716 1.16319,29.27561 1.1632,-23.94736 1.15942,46.63966 1.16317,-16.23836 1.16323,14.64946 1.16316,-20.94037 1.15943,47.78606 1.16321,-37.81274 1.16318,-2.48686 1.15942,-39.99423 1.1632,41.74617 1.16319,23.13739 1.15941,-23.64977 1.16321,-39.07557 1.16318,20.74887 1.15942,50.45406 1.16318,-38.48036 1.16321,-7.76337 1.16321,20.98434 1.15942,-44.34946 1.16318,41.47962 1.16321,19.96476 1.15939,-48.86774 1.1632,48.85223 1.16321,-23.47123 1.15939,-20.80062 1.1632,42.72693 1.16322,-32.04971 1.16318,2.67575 1.15943,-16.23576 1.1632,-7.99625 1.16318,51.81265 1.15942,8.01696 1.16318,-51.36239 1.1632,-0.0881 1.15942,29.14364 1.16321,8.43619 1.16318,-24.43905 1.16321,-11.81584 1.15942,7.26909 1.16319,-28.4216496 1.16316,35.7683796 1.15946,-22.23167 1.16318,-7.63655"/><use href=#noise x=279.87 y=15.39 /> <use href=#noise x=279.87 y=55.89 /> <use href=#noise x=357.55 y=15.39 /> <use href=#noise x=357.55 y=55.89 /></g></svg><figcaption>Illustration of transmission of an analogue signal over a balanced cable. For brevity the diagram missuses symbols from digital signal processing and should not be taken as a technically correct representation.</figcaption></figure><p>A balanced cable addresses this problem by sending the information over three wires: hot (or positive), cold (or negative) and ground. Hot wire carries the signal <var>S</var> as before, cold one carries the inverse of the signal <var>-S</var> and ground is zero as before. Like before, when information travels over the cable, noise is introduced. Crucially, because it’s a single cable, noise on the positive and negative wires are <em>strongly</em> correlated. 
The receiver therefore gets <var>S + e</var> on the hot wire and <var>-S + e</var> on the cold wire. All it needs to do is invert the signal on the negative wire and add both signals together. Inversion flips the phase of the noise on the cold wire so that it cancels out the error remaining on the positive wire: <var>(S + e) + -(-S + e) = S + e + S - e = 2S</var>, that is, the original signal at twice the amplitude with the noise gone.Explicit isn’t better than implicithttp://mina86.com/2021/06/06/explicit-isnt-better-than-implicit2021-06-06T20:12:49Z2021-06-06T20:12:49ZMichał ‘mina86’ Nazarewiczhttps://mina86.com<p>Continuing <a href=/2021/embrace-the-bloat/ >the new tradition of clickbaity titles</a>, let’s talk about explicitness. It’s a subject that comes up when bike-shedding language and API designs. Pointing out that a construct or a function exhibits implicit behaviour is often touted as an ultimate winning argument against it.<p>There are two problems with such a line of reasoning. First of all, people claim to care about features being explicit but have come to accept a lot of implicit behaviour without batting an eye. Second of all, no one actually agrees what the terms mean.<p>In this article I’ll demonstrate those two issues and show that ‘explicit over implicit’ is the wrong value to uphold. It’s merely a proxy for a much more useful goal interfaces should strive for. By the end I’ll demonstrate what we should look at instead.<h2>Dispelling the myths</h2><p>Let’s start by examining just how explicit Python and Rust are. After all, their communities often boast the virtue of explicitness in their respective languages. I’m sure it’s going to be completely uncontroversial to suggest that they may be much more implicit than some give them credit for.<h3>The Zen of Python is a lie</h3><p><a href=//www.python.org/dev/peps/pep-0020/ >The Zen of Python</a> is a collection of aphorisms which represent Python’s guiding principles. 
It’s not clear, at least to me, whether they are listed in order of importance, but the second entry states that ‘explicit is better than implicit’. Despite that, there are multiple instances where this rule is broken. In particular, Python implicitly:<ul><li>creates new variables. One cannot even argue that the assignment statement defines a variable since that’s not the case as can be seen in the following toy example:<pre>foo = 'foo'
def func():
    return foo
    foo = 'bar'
print(func())</pre><li>propagates exceptions turning any statement into a possible function exit point;<li>converts values to booleans in conditions. For example, <code>if some_list:</code> is a Pythonic way to check if a list is non-empty while <code>if not var:</code> is a Pythonic way of checking whether a variable is <code>None</code>;<li>converts between booleans, integers and floats in arithmetic operations. Depending on the operation, this happens even if both operands are of the same type;<li>concatenates strings separated by white-space;<li>constructs tuples from comma-separated values (without the need to type parentheses);<li>loads package’s <code>__init__.py</code> file when importing a module; and<li>returns <code>None</code> from functions lacking explicit return.</ul><p>It could even be argued that garbage collection is an implicit behaviour. After all, objects are never explicitly freed and all memory management is hidden from the user.<h3>Rust philosophy of explicitness is not a thing</h3><p>But let’s not dwell on scripting languages; let’s change gears and go a level lower, to the compiled and strictly-typed Rust. While it doesn’t have a formal list of guiding principles, explicitness is often cited as an important value. 
Yet, what’s often forgotten is that Rust implicitly:<ul><li>infers types,<li>infers lifetimes in function prototypes,<li>shortens lifetimes of references,<li>performs <code>Deref</code> coercion,<li>passes <code>self</code> by value or reference based on method prototype,<li>clones <code>Copy</code> types when passed to functions by value,<li>calls <code>drop</code> of <code>Drop</code> objects when they go out of scope,<li>converts error type when question mark operator is used,<li>returns <code>()</code> from functions lacking tail or return expression and<li>resolves <code>format!</code> named arguments not explicitly listed in invocation of the macro.</ul><p>To be even more contrarian, I could claim <code>sort</code> method <em>implicitly</em> uses natural ordering. Or that order of operations in <code>230 - 220 / 2</code> expression isn’t explicitly specified.<p>But the point here isn’t to demonstrate that Python or Rust aren’t ‘explicit’. Rather, it is to show that even languages which seemingly champion explicitness compromise on that principle. This means that saying some new feature exhibits implicit behaviour is not a be-all and end-all argument for blocking such design.<h2>‘You’ve typed it so it’s explicit’</h2><p>Then again, maybe I’m completely off the mark? Perhaps all the aforementioned behaviours aren’t implicit after all? For example, Python documents <a href=//docs.python.org/3/reference/expressions.html#booleans>quite clearly</a> what values are interpreted as false and which as true. This means that there is nothing implicit in <code>if some_list:</code> not executing the body if the list is empty, right?<p>Some of the examples I’ve enumerated are definitely less clear-cut than others, but I challenge anyone who thinks that none of them are valid to come up with justification and then present an example of a feature of any non-esoteric language which is implicit. 
One will quickly realise that either at least some of the aforementioned behaviours are implicit or there’s no such thing as an implicit behaviour and thus the whole discussion is moot.<p>Ultimately this leaves us with no commonly agreed definition of what it means for a feature or interface to be explicit. There’s not even a consensus on some vague understanding of the phrase. On one extreme, a not unreasonable argument that nothing is implicit could be made (after all, a program behaves exactly according to the language’s documentation); on the other, some C♯ programmers argue that <code>""</code> isn’t an explicit-enough way of specifying an empty string. Unfortunately, I don’t have a definition which would satisfy everyone and thus solve this particular problem. Instead, I’m side-stepping the entire discussion.<h2>Explicit doesn’t matter</h2><p>Because, you see, when people say ‘design X is bad because it’s not explicit’ they actually don’t care about the feature being implicit. Instead, they (potentially subconsciously) use the level of explicitness as a proxy to decide how easy it is to reason about the program.<p>To continue with Python’s boolean coercion example, it is well understood that Python considers zero values and empty containers to be false. Therefore there’s little issue with <code>if some_list:</code> checking whether the list is empty. However, time of day is neither a collection nor a number so <code>if some_time:</code> <a href=//lwn.net/Articles/590299/ >checking whether the object represents midnight</a> <a href=//github.com/python/cpython/commit/ee6bdc07d65d5df418ad9dc2bb7139666af9a6b2>was</a> an issue. It wasn’t because the check was implicit (this was also true when testing for an empty container) but rather because it was an unexpected behaviour.<p id=b1>When Rust infers types, the compiler picks the only sensible and obvious choice. If there’s any ambiguity, the programmer has to explicitly specify the type. 
Contrast that with function overloading in C++. The rules are defined, but they are so convoluted only a handful of people understand them. Again, the issue isn’t whether the feature is implicit or not; the problem is how easy it is to reason about the code and how likely it is that the compiler does something unexpected.<a href=#f1>†</a><h3>Principle of least astonishment</h3><p>What actually matters is following the principle of least astonishment. Rather than wondering whether a particular design is explicit enough, the correct question to ask is how likely it is for a feature to lead to a surprising behaviour.<p>Bugs often emerge when the compiler does something the programmer doesn’t expect. Python treating <code>True</code> as one in arithmetic operations is the only sensible non-throwing interpretation which means that the principle is preserved. On the other hand, C promoting integers can easily lead to astonishing results (such as an unsigned object comparing less than a negative value of a signed type) which violates the principle.<h3>Summary</h3><p>In conclusion, advocating explicitness for explicitness’s sake is not sound. Being explicit is a tool which, in some cases, helps minimise surprises in the code and makes it easier to reason about a program. If implicit behaviour does not make the source more confusing than it normally would be, there’s no reason to fight it.<p>But even then, all those things need to be weighed in the context of other useful properties of a design. Ergonomics of a programming language matter and it may sometimes be worth sacrificing the principle of least astonishment if that means the code may be more beautiful (which coincidentally is another aphorism from The Zen of Python).<p id=f1><a href=#b1>†</a> As an aside, confusing a general idea of a feature with the implementation of the feature in a specific language is another common fallacy. 
Function resolution in C++ may be convoluted but that doesn’t mean that function overloading in general needs to be confusing. For example, picking methods in Java is quite straightforward and allowing overloading on <a href=//en.wikipedia.org/wiki/Arity>arity</a> in Rust would leave no ambiguity in the code. Arguing against default parameters on account of how mystifying C++ can get is therefore invalid. A similar comparison could be made with Python’s treatment of false values. Yes, it may sometimes lead to astonishing results (e.g. intending to test for <code>None</code> but forgetting that non-<code>None</code> values are tested as well), but if adapted to Rust, those surprising behaviours would not be an issue (thanks to strict typing).Programmer (vs) Dvorakhttp://mina86.com/2021/05/30/programmer-dvorak2021-05-30T11:36:23Z2021-05-30T11:36:23ZMichał ‘mina86’ Nazarewiczhttps://mina86.com<p><small>Update: The article was updated in October 2021 to include a direct comparison of Shift usage between Dvorak and Programmer Dvorak layouts.</small><p>A few years ago I made a decision that had the potential to change the course of history. Had I gone down a different path, the <a href="//bugs.freedesktop.org/show_bug.cgi?id=25200">pl(dvp)</a> layout might never have seen the light of day. But did I make a wise choice? Or had I chosen poorly?<p>I’m talking of course about the decision to learn <a href=//www.kaufmann.no/roland/dvorak/ >Programmer Dvorak</a> rather than a regular Dvorak keyboard layout. The main difference between the two is that in the former digits are entered with the Shift key pressed down which allows several punctuation marks often used when programming to be typed without the need to reach for Shift. 
The hypothesis goes that developers use digits less often thus such design optimises the layout for them.<p>To test this I’ve grabbed <a href=//github.com/mina86>all my git repositories</a> and constructed a histogram of characters used in text files present there. Since letters are on the same position on both layouts in question, only digits and punctuation characters are compared on the histogram:<figure><svg xmlns=http://www.w3.org/2000/svg version=1.1 width=44em height=17em viewbox="0 0 704 272" stroke-width=1><defs> <pattern id=un patternunits=userSpaceOnUse width=4 height=4><rect width=4 height=4 fill=#1b9e77 /><path d=M-1,1l2,-2M0,4l4,-4M3,5l2,-2 stroke=#000 /></pattern> <pattern id=sh patternunits=userSpaceOnUse width=4 height=4><rect width=4 height=4 fill=#d95f02 /><path d=M-1,-1l2,2M0,0l4,4M3,3l2,2 stroke=#000 /></pattern> </defs><g id=gr><path stroke=var(--e) d=M8,32h688m0,40H8m0,40h688m0,40H8m0,40h688 /><path fill=#fffc d=M438,32h246v80H438Z /><text style=fill:#212121><tspan x=470 y=55>Not number row</tspan><tspan x=470 y=79>Unshifted (number row)</tspan><tspan x=470 y=103>Shifted (number row)</tspan></text></g><path fill=url(#un) d=M64,192v-125h14v125zm16,0v-124h14v124zm48,0v-111h14v111zm16,0v-110h14v110zm160,0v-50h14v50zm16,0v-41h14v41zm16,0v-41h14v41zm128,0v-26h14v26zm16,0v-24h14v24zm32,0v-24h14v24zm32,0v-14h14v14zm16,0v-10h14v10zM446,64v16h16V64zM16,224h360v14H16z /><path fill=url(#sh) d=M192,192v-89h14v89zm64,0v-66h14v66zm16,0v-55h14v55zm80,0v-37h14v37zm16,0v-35h14v35zm16,0v-33h14v33zm32,0v-32h14v32zm16,0v-31h14v31zm16,0v-27h14v27zm48,0v-24h14v24zm80,0v-10h14v10zm48,0v-6h14v6zm32,0v-3h14v3zM446,88v16h16V88zM16,240h230v14H16z /><path fill=#7570b3 
d=M16,192v-160h14v160zm16,0v-151h14v151zm16,0v-148h14v148zm48,0v-114h14v114zm16,0v-112h14v112zm48,0v-99h14v99zm16,0v-95h14v95zm32,0v-84h14v84zm16,0v-84h14v84zm16,0v-73h14v73zm48,0v-53h14v53zm112,0v-32h14v32zm128,0v-18h14v18zm64,0v-9h14v9zm16,0v-8h14v8zm32,0v-4h14v4zm32,0v-2h14v2zM446,40v16h16V40zM16,208h640v14H16z /><text font-size=0.875em text-anchor=middle><tspan x=23 y=24>-</tspan><tspan x=39 y=33>"</tspan><tspan x=55 y=36>.</tspan><tspan x=71 y=59>)</tspan><tspan x=87 y=60>(</tspan><tspan x=103 y=70>,</tspan><tspan x=119 y=72>/</tspan><tspan x=135 y=73>*</tspan><tspan x=151 y=74>=</tspan><tspan x=167 y=85>_</tspan><tspan x=183 y=89>;</tspan><tspan x=199 y=95>0</tspan><tspan x=215 y=100>></tspan><tspan x=231 y=100>:</tspan><tspan x=247 y=111><</tspan><tspan x=263 y=118>1</tspan><tspan x=279 y=129>2</tspan><tspan x=295 y=131>'</tspan><tspan x=311 y=134>#</tspan><tspan x=327 y=143>{</tspan><tspan x=343 y=143>}</tspan><tspan x=359 y=147>4</tspan><tspan x=375 y=149>3</tspan><tspan x=391 y=151>8</tspan><tspan x=407 y=152>\</tspan><tspan x=423 y=152>5</tspan><tspan x=439 y=153>6</tspan><tspan x=455 y=157>9</tspan><tspan x=471 y=158>$</tspan><tspan x=487 y=160>[</tspan><tspan x=503 y=160>7</tspan><tspan x=519 y=160>]</tspan><tspan x=535 y=166>&</tspan><tspan x=551 y=170>+</tspan><tspan x=567 y=174>!</tspan><tspan x=583 y=174>%</tspan><tspan x=599 y=175>|</tspan><tspan x=615 y=176>@</tspan><tspan x=631 y=178>`</tspan><tspan x=647 y=180>?</tspan><tspan x=663 y=181>~</tspan><tspan x=679 y=182>^</tspan><tspan x=672 y=219>52%</tspan><tspan x=392 y=235>29%</tspan><tspan x=262 y=251>19%</tspan></text></svg><figcaption>Fig. 1. Histogram of characters used in text files authored by me present in my Git repositories.</figcaption></figure><h2>Analysis</h2><p>The graph supports the idea that punctuation is used more often than digits when programming. 
Keys outside of the number row are for the most part in the same position on regular Dvorak and Programmer Dvorak layouts so the shifted and unshifted number-row characters are of main interest. Of those, the first four are non-digits and within the first ten only three are digits.<p>A quite striking feature of Programmer Dvorak is that it puts digits in a ‘7531902468’ sequence (rather than ordering them). The histogram supports that design decision as well. The two most used digits are zero and one which are accessible with the index fingers. Traditional sorted ordering puts them in the little fingers’ columns.<p>A minor difference between the two Dvorak variants is the position of the apostrophe and semicolon keys. The data shows that I use the double quote quite often. Since I’ve never learned to properly use Shift keys, the location used by Programmer Dvorak ends up beneficial for me. After all, pressing Shift with the key right next to it is easier than Shift and the key two rows up.<p id=b1>Lastly, Programmer Dvorak moves plus and equals signs to the number row. On the regular Dvorak layout those keys are accessible by stretching the right little finger one row up and two columns to the side. It’s hardly a comfortable key to reach. Programmer Dvorak puts caret and at sign there instead. And yes, you’ve guessed it, the data supports that decision: equals and plus are used much more than at sign and caret.<a href=#f1>†</a><h3>Direct comparison to Dvorak</h3><p>While Dvorak and Programmer Dvorak share most of the layout outside of the number row, there are some differences. Most notably, the equals sign is present in both layouts on unshifted positions: on the top row in Dvorak and on the number row in Programmer Dvorak. 
Looking only at the frequencies of number-row keys therefore isn’t enough to evaluate the two layouts.<figure><svg xmlns=http://www.w3.org/2000/svg version=1.1 width=44em height=6em viewbox="0 0 704 96" stroke-width=1><path fill=url(#un) d=M0,29h384v16H0ZM0,75h423v16H0Z /><path fill=url(#sh) d=M384,29h320v16H384ZM423,75h281v16H423Z /><text x=8 y=25>Dvorak <tspan font-size=0.875em> <tspan x=380 text-anchor=end>55% unshifted</tspan> <tspan x=388>45% shifted</tspan> </tspan></text><text x=8 y=71>Programmer Dvorak <tspan font-size=0.875em> <tspan x=419 text-anchor=end>60% unshifted</tspan> <tspan x=427>40% shifted</tspan> </tspan></text></svg><figcaption>Fig. 2. Frequencies of shifted and unshifted non-letter characters on Dvorak and Programmer Dvorak layouts.</figcaption></figure><p>Collating data for all non-letter characters shows that Programmer Dvorak has a slight advantage with 60% of the typed characters being on unshifted positions on that layout compared to 55% on Dvorak. The big wins for the former are parentheses and the asterisk while the latter catches up a little with the first three natural numbers.<h3>Language composition</h3><p>Different languages use their own syntax and thus the most commonly used characters end up different. Someone writing exclusively in Lisp will wear out their parentheses much quicker than someone writing in Haskell. 
It turns out I write mostly in languages whose syntax is based on C.<table><thead><tr><th>File type<th>Percentage of lines<tbody><tr style=background:var(--j)><td>C or C++<td class=r>41.53%<tr style="background:linear-gradient(to right,var(--j) 22%,#fff0 22%)"><td>HTML or XML-based<td class=r>8.98%<tr style="background:linear-gradient(to right,var(--j) 19%,#fff0 19%)"><td>Java<td class=r>8.02%<tr style="background:linear-gradient(to right,var(--j) 17%,#fff0 17%)"><td>Misc configuration<td class=r>7.20%<tr style="background:linear-gradient(to right,var(--j) 16%,#fff0 16%)"><td>Python<td class=r>6.64%<tr style="background:linear-gradient(to right,var(--j) 14%,#fff0 14%)"><td>Shell script<td class=r>5.65%<tr style="background:linear-gradient(to right,var(--j) 9%,#fff0 9%)"><td>Rust<td class=r>3.89%<tr style="background:linear-gradient(to right,var(--j) 9%,#fff0 9%)"><td>Perl<td class=r>3.89%<tr style="background:linear-gradient(to right,var(--j) 8%,#fff0 8%)"><td>Misc text<td class=r>3.33%<tr style="background:linear-gradient(to right,var(--j) 7%,#fff0 7%)"><td>JavaScript<td class=r>2.94%<tr style="background:linear-gradient(to right,var(--j) 6%,#fff0 6%)"><td>Lisp<td class=r>2.55%<tr style="background:linear-gradient(to right,var(--j) 6%,#fff0 6%)"><td>LaTeX<td class=r>2.33%<tr style="background:linear-gradient(to right,var(--j) 3%,#fff0 3%)"><td>Makefile<td class=r>1.34%<tr style="background:linear-gradient(to right,var(--j) 4%,#fff0 4%)"><td>(Rest)<td class=r>1.71%</table><p>But I also dabble in Lisp. This brings up a question: is that why I write so many parentheses? Analysing all files except for <code>*.el</code> reveals that my using Emacs doesn’t skew the result too much. 
There is a little bit of shuffling but parentheses still end up in the top five (even though their relative frequency suffers) and the first digit (zero) is in 12th position.<figure><svg xmlns=http://www.w3.org/2000/svg version=1.1 width=44em height=12em viewbox="0 0 704 192" stroke-width=1><use href=#gr /><path fill=url(#un) d=M64,192v-127h14v127zm16,0v-125h14v125zm48,0v-119h14v119zm16,0v-119h14v119zm160,0v-54h14v54zm16,0v-45h14v45zm16,0v-44h14v44zm128,0v-28h14v28zm32,0v-26h14v26zm16,0v-25h14v25zm32,0v-15h14v15zm16,0v-11h14v11zM446,64v16h16V64z /><path fill=url(#sh) d=M192,192v-96h14v96zm64,0v-72h14v72zm16,0v-61h14v61zm80,0v-41h14v41zm16,0v-39h14v39zm16,0v-36h14v36zm16,0v-35h14v35zm16,0v-34h14v34zm32,0v-29h14v29zm32,0v-27h14v27zm96,0v-11h14v11zm48,0v-7h14v7zm32,0v-3h14v3zM446,88v16h16V88z /><path fill=#7570b3 d=M16,192v-160h14v160zm16,0v-159h14v159zm16,0v-159h14v159zm48,0v-125h14v125zm16,0v-123h14v123zm48,0v-108h14v108zm16,0v-100h14v100zm32,0v-91h14v91zm16,0v-90h14v90zm16,0v-79h14v79zm48,0v-57h14v57zm144,0v-33h14v33zm96,0v-19h14v19zm64,0v-10h14v10zm16,0v-8h14v8zm32,0v-4h14v4zm32,0v-2h14v2zM446,40v16h16V40z /><text font-size=0.875em text-anchor=middle><tspan x=23 y=24>.</tspan><tspan x=39 y=25>"</tspan><tspan x=55 y=25>-</tspan><tspan x=71 y=57>)</tspan><tspan x=87 y=59>(</tspan><tspan x=103 y=59>/</tspan><tspan x=119 y=61>,</tspan><tspan x=135 y=65>*</tspan><tspan x=151 y=65>=</tspan><tspan x=167 y=76>_</tspan><tspan x=183 y=84>;</tspan><tspan x=199 y=88>0</tspan><tspan x=215 y=93>></tspan><tspan x=231 y=94>:</tspan><tspan x=247 y=105><</tspan><tspan x=263 y=112>1</tspan><tspan x=279 y=123>2</tspan><tspan x=295 y=127>'</tspan><tspan x=311 y=130>#</tspan><tspan x=327 y=139>{</tspan><tspan x=343 y=140>}</tspan><tspan x=359 y=143>4</tspan><tspan x=375 y=145>3</tspan><tspan x=391 y=148>8</tspan><tspan x=407 y=149>5</tspan><tspan x=423 y=150>6</tspan><tspan x=439 y=151>\</tspan><tspan x=455 y=155>9</tspan><tspan x=471 y=156>$</tspan><tspan x=487 y=157>7</tspan><tspan x=503 
y=158>[</tspan><tspan x=519 y=159>]</tspan><tspan x=535 y=165>&</tspan><tspan x=551 y=169>+</tspan><tspan x=567 y=173>!</tspan><tspan x=583 y=173>%</tspan><tspan x=599 y=174>|</tspan><tspan x=615 y=176>@</tspan><tspan x=631 y=177>`</tspan><tspan x=647 y=180>?</tspan><tspan x=663 y=181>~</tspan><tspan x=679 y=182>^</tspan></text></svg><figcaption>Fig. 3. Histogram of characters used in text files authored by me present in my Git repositories excluding Emacs Lisp files.</figcaption></figure><h2>Conclusion</h2><p>Looks like my choice was correct. It has been scientifically proven that Programmer Dvorak is better than regular Dvorak. Who knows what catastrophes were averted thanks to me!<p>Except of course this was hardly scientific and the results should be taken with a grain of salt. After all, looking at the characters in source files doesn’t necessarily reflect what keys are pressed to construct those files. For example, while I don’t configure my editor to automatically insert parentheses, braces or apostrophes, if I had such a feature enabled, the methodology chosen here would overestimate use of punctuation characters. Looking at the files also ignores effects of copying and pasting text. 
It’s hard to tell whether that would affect the data though or whether the effect balances itself out.<p>Overall though, I’m fairly satisfied with the results demonstrating that for me Programmer Dvorak was a better choice than regular Dvorak layout.<h2>Addendum: Qwerty</h2><p>Since inevitably someone will be curious about comparison to Qwerty layout, the statistics for the letters are as follows:<table><thead><tr><th>Letter<th>Frequency<th>Dvorak<th>Qwerty<tbody><tr style=background:var(--j)><th>e<td class=r>11.5%<td>home<td>top<tr style="background:linear-gradient(to right,var(--j) 79%,#fff0 79%)"><th>t<td class=r>9.1%<td>home<td>top*<tr style="background:linear-gradient(to right,var(--j) 65%,#fff0 65%)"><th>i<td class=r>7.5%<td>home*<td>top<tr style="background:linear-gradient(to right,var(--j) 62%,#fff0 62%)"><th>a<td class=r>7.2%<td>home<td>home<tr style="background:linear-gradient(to right,var(--j) 61%,#fff0 61%)"><th>n<td class=r>7.0%<td>home<td>bottom*<tr style="background:linear-gradient(to right,var(--j) 59%,#fff0 59%)"><th>s<td class=r>6.8%<td>home<td>home<tr style="background:linear-gradient(to right,var(--j) 59%,#fff0 59%)"><th>r<td class=r>6.7%<td>top<td>top<tr style="background:linear-gradient(to right,var(--j) 56%,#fff0 56%)"><th>o<td class=r>6.5%<td>home<td>top<tr style="background:linear-gradient(to right,var(--j) 40%,#fff0 40%)"><th>l<td class=r>4.6%<td>top<td>home<tr style="background:linear-gradient(to right,var(--j) 37%,#fff0 37%)"><th>c<td class=r>4.3%<td>top<td>bottom<tbody><tr style="background:linear-gradient(to right,var(--j) 32%,#fff0 32%)"><th>d<td class=r>3.7%<td>home*<td>home<tr style="background:linear-gradient(to right,var(--j) 27%,#fff0 27%)"><th>p<td class=r>3.1%<td>top<td>top<tr style="background:linear-gradient(to right,var(--j) 27%,#fff0 27%)"><th>u<td class=r>3.1%<td>home<td>top<tr style="background:linear-gradient(to right,var(--j) 26%,#fff0 26%)"><th>m<td class=r>2.9%<td>bottom<td>bottom<tr 
style="background:linear-gradient(to right,var(--j) 25%,#fff0 25%)"><th>h<td class=r>2.9%<td>home<td>home*<tr style="background:linear-gradient(to right,var(--j) 24%,#fff0 24%)"><th>f<td class=r>2.8%<td>top*<td>home<tr style="background:linear-gradient(to right,var(--j) 18%,#fff0 18%)"><th>g<td class=r>2.1%<td>top<td>home*<tr style="background:linear-gradient(to right,var(--j) 14%,#fff0 14%)"><th>b<td class=r>1.6%<td>bottom*<td>bottom*<tr style="background:linear-gradient(to right,var(--j) 14%,#fff0 14%)"><th>y<td class=r>1.6%<td>top*<td>top*<tr style="background:linear-gradient(to right,var(--j) 12%,#fff0 12%)"><th>w<td class=r>1.3%<td>bottom<td>top<tbody><tr style="background:linear-gradient(to right,var(--j) 9%,#fff0 9%)"><th>v<td class=r>1.0%<td>bottom<td>bottom<tr style="background:linear-gradient(to right,var(--j) 8%,#fff0 8%)"><th>k<td class=r>1.0%<td>bottom<td>home<tr style="background:linear-gradient(to right,var(--j) 8%,#fff0 8%)"><th>x<td class=r>0.9%<td>bottom*<td>bottom<tr style="background:linear-gradient(to right,var(--j) 4%,#fff0 4%)"><th>z<td class=r>0.4%<td>bottom<td>bottom<tr style="background:linear-gradient(to right,var(--j) 2%,#fff0 2%)"><th>j<td class=r>0.3%<td>bottom<td>home<tr style="background:linear-gradient(to right,var(--j) 2%,#fff0 2%)"><th>q<td class=r>0.2%<td>bottom<td>top</table><p>An asterisk indicates keys towards the middle of the keyboard which the index finger needs to stretch sideways to reach.<p>This also gives some credibility to the idea of swapping ‘i’ and ‘u’ keys on Dvorak-based layouts. At least in the source code I write, the former is used over twice as frequently as the latter. 
On the other hand, as <a href=//www.reddit.com/r/dvorak/comments/uonpi6/comment/i8gsuh0/ >The Temp has suggested</a>, a reason to prefer keeping the two keys as they are might be bigrams: ‘“ou” is very common, and it would be a strain to enter that sequence if “u” and “i” were swapped.’<p id=f1><a href=#b1>†</a> Then again, I usually type on a Kinesis Advantage keyboard with <a href=//github.com/mina86/dot-files/tree/master/kinesis>a small dose of remapping</a>. This puts the at sign under my middle finger at the bottom of the keyboard. As a result, it’s slightly easier to type than the equals sign which requires my index finger to be stretched far up and to the right.