Go: string[]byte

Posted by Michał ‘mina86’ Nazarewicz on 28th of February 2017

Yes… I’ve started coding in Go recently. It lacks many things but the one feature relevant to this post is const keyword. Arrays and slices in particular are always mutable and so equivalent of C’s const char * does not exist.

On the other hand, strings are immutable which means that conversion between a string and []byte requires memory allocation and copying of the data. Often this might be acceptable but to squeeze every last cycle the following two functions might help achieve zero-copy implementation:

func String(bytes []byte) string {
	hdr := *(*reflect.SliceHeader)(unsafe.Pointer(&bytes))
	return *(*string)(unsafe.Pointer(&reflect.StringHeader{
		Data: hdr.Data,
		Len:  hdr.Len,
	}))
}

func Bytes(str string) []byte {
	hdr := *(*reflect.StringHeader)(unsafe.Pointer(&str))
	return *(*[]byte)(unsafe.Pointer(&reflect.SliceHeader{
		Data: hdr.Data,
		Len:  hdr.Len,
		Cap:  hdr.Len,
	}))
}

Depending on the length of the strings, the difference in performance might be noticeable:

PSA: Creating world-unreadable files

Posted by Michał ‘mina86’ Nazarewicz on 5th of February 2017

I’ve been reading tutorials on using key-files for disk encryption. Common approach for generating such files is to create it using something similar to head -c 4096 /dev/urandom >key-file and only then change it’s permissions (usually with a plain chmod 400 key-file) to prevent others from reading it.

Please, stop doing this and spreading that method. The correct way of achieving the effect is:

(umask 077; head -c 64 /dev/random >key-file)

Or if the file needs to be created as root while command is run by a different user:

sudo sh -c 'umask 077; head -c 64 /dev/random >key-file'

The first method creates the file as world-readable and before its permission are changed anyone can read it. The second method creates the file as readable only by its owner from the beginning thus preventing the secret disclosure.

Generating random reals

Posted by Michał ‘mina86’ Nazarewicz on 26th of December 2016

A well known way of generating random floating-point numbers in the presence of a pseudo-random number generator (PRNG) is to divide output of the latter by one plus its maximum possible return value.

extern uint64_t random_uint64(void);

double random_double(void) {
	return random_uint64() / (UINT64_MAX + 1.0);
}

This method is simple, effective, inefficient and wrong on a few levels.

Strach

Posted by Michał ‘mina86’ Nazarewicz on 28th of September 2016

English version available on The Codeless Code.

Niedawno przyjęty do świątyni mnich zbliżył się do mistrza.

― Otrzymałem zadanie dodania kilku nowych funkcji do systemu obsługi zamówień Cesarskiego Szewca, ale nie jestem w stanie zrozumieć, jak on działa. Logika jest rozproszona pomiędzy wiele aplikacji zaimplementowanych przy użyciu najróżniejszych technologii. Zamiast stworzyć wspólne biblioteki, autorzy najzwyklej skopiowali fragmenty kodu pomiędzy różnymi miejscami, często wprowadzając subtelne rozbieżności. Zadania pracujące w tle wyszukują i modyfikują rekordy w bazie danych bez żadnego udokumentowanego powodu. Sama baza danych wydaje się spiskować przeciwko mnie: prosta modyfikacja jednej tabeli może wyzwolić kaskadę zmian w wielu innych.

Python tips and tricks

Posted by Michał ‘mina86’ Nazarewicz on 1st of September 2016

Python! My old nemesis, we meet again. Actually, we meet all the time, but despite that there are always things which I cannot quite remember how to do and need to look them up. To help with the searching, here there are collected in one post:

On Unicode

Posted by Michał ‘mina86’ Nazarewicz on 25th of October 2015

There are a lot of misconceptions about Unicode. Most are there because people assume what they know about ASCII or ISO-8859-* is true about Unicode. They are usually harmless but they tend to creep into minds of people who work with text which leads to badly designed software and technical decisions made based on false information.

Without further ado, here’s a few facts about Unicode that might surprise you.

Bash right prompt

Posted by Michał ‘mina86’ Nazarewicz on 28th of September 2015

There are multiple ways to customise Bash prompt. There’s no need to look for long to find plethora of examples with fancy, colourful PS1s. What have been a bit problematic is having text on the right of the input line. In this article I’ll try to address that shortcoming.

Getting text on the right

The typical approach is using PROMPT_COMMAND to output desired content. The variable specifies a shell code Bash executes prior to rendering the primary prompt (i.e. PS1).

The idea is to align text to the right and then using carrier return move the cursor back to the beginning of the line where Bash will start rendering its prompt. Let’s look at an example of showing time in various locations:

__command_rprompt() {
	local times= n=$COLUMNS tz
	for tz in ZRH:Europe/Zurich PIT:US/Eastern \
	          MTV:US/Pacific TOK:Asia/Tokyo; do
		[ $n -gt 40 ] || break
		times="$times ${tz%%:*}\e[30;1m:\e[0;36;1m"
		times="$times$(TZ=${tz#*:} date +%H:%M)\e[0m"
		n=$(( $n - 10 ))
	done
	[ -z "$times" ] || printf "%${n}s$times\\r" ''
}
PROMPT_COMMAND=__command_rprompt
Terminal window presenting right prompt behaviour.

Clearing the line on execution

It has one annoying issue. The right text reminds on screen even after executing a command. Typically this is a matter of aesthetic but it also makes copying and pasting session history more convoluted.

A manual solution is to use redraw-current-line readline function (e.g. often bound to C-l). It clears the line and prints the prompt and whatever input has been entered thus far. PROMPT_COMMAND is not executed so the right text does not reappear.

Lack of automation can be addressed with a tiny bit of readline magic and a ~/.inputrc file which deserves much more fame than what it usually gets.

Tricky part is bindind C-m and C-j to two readline functions, redraw-current-line followed by accept-line, which is normally not possible. This limitation can be overcome by binding the key sequences to a different sequence which will be interpreted recursively.

To test that idea it’s enough to execute:

bind '\C-l:redraw-current-line'
bind '\M-\C-j:accept-line'
bind '\C-j:"\C-l\M-\C-j"' '\C-m:"\C-j"'

Making this permanent is as easy as adding the following lines to ~/.inputrc:

$if Bash
    "\C-l": redraw-current-line
    "\e\C-j": accept-line
    "\C-j": "\C-l\e\C-j"
    "\C-m": "\C-l\e\C-j"
$endif

With that, the right prompt will disappear as soon as the shell command is executed. (Note the use of \M- in bind command vs. \e in ~/.inputrc file).

The time has come to stand up for the GPL

Posted by Michał ‘mina86’ Nazarewicz on 11th of March 2015

For people who know me it should come with no surprise that I support free software in most forms it can take. I also believe that if someone gives you something at zero price, basic courtesy dictates that you follow wishes of that person. This is why when Software Freedom Conservancy started a GPL Compliance Project for Linux Developers I didn’t hesitate even for a minute to offer little Linux copyright I held to help the effort.

Most importantly though, it is why I fully support Conservancy in taking legal action against VMware which for years has been out of compliance with Linux’s license.

If you care about free software, the GPL or want more projects like OpenWrt, consider donating to help Christoph Hellwig and the Conservancy with their legal battle against this multi-billion-dollar corporation who for some reason decided to free-ride on other people’s work without respecting their wishes.

If you can’t or don’t want to donate, twitting something along the lines of ‘Play by the rules, @VMware. I defend the #GPL with Christoph & @Conservancy. #DTRTvmware Help at https://sfconservancy.org/supporter/’ or otherwise spreading the word will help as well. Oh, and in case you were, like I was, wondering — DTRT stands for ‘do the right thing’.

And if you want to know more:

Miscellaneous tips and tricks

Posted by Michał ‘mina86’ Nazarewicz on 14th of December 2014

Don’t you hate when you need to do something you had done before, but cannot remember how exactly? I’ve been in that situation several times and sometimes looking up for a correct method turned out considerably harder than it should. To alleviate the need for future Googling, here’s a bag of notes I can reference easily:

Looking for Python stuff? Those are now in separate post:

Map-reduce explained

Posted by Michał ‘mina86’ Nazarewicz on 18th of May 2014

Outside of functional programming context, map-reduce refers to a technique for processing data. Thanks to properties of map and reduce operations, computations which can be expressed using them can be highly parallelised, which allows for faster processing of high volumes of data.

If you’ve ever wondered how tools such as Apache Hadoop work, you’re at the right page. In this article I’ll explain what map and reduce are and later also introduce a shuffle phase.