• mina86.com

    Map-reduce explained

    Outside of a functional programming context, map-reduce refers to a technique for processing data. Thanks to the properties of the map and reduce operations, computations which can be expressed using them can be highly parallelised, which allows for faster processing of high volumes of data.

    If you’ve ever wondered how tools such as Apache Hadoop work, you’re on the right page. In this article I’ll explain what map and reduce are and later also introduce a shuffle phase.
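
    As a minimal sketch of the two operations (the input strings and the function names count_words and add are made up for illustration, and this runs in a single process rather than on a cluster), counting words could look roughly like this:

```python
from functools import reduce

# Hypothetical input: each string stands in for one chunk of a larger data set.
chunks = ["to be or not to be", "that is the question"]


def count_words(chunk):
    """Map step: turn one chunk into a partial result, independently of the others."""
    return len(chunk.split())


def add(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


# Every chunk is mapped on its own, so this step could run in parallel.
partial_counts = list(map(count_words, chunks))

# Addition is associative, so partial results can be combined in any grouping.
total = reduce(add, partial_counts, 0)

print(partial_counts, total)
```

    Because count_words looks at one chunk at a time and add is associative, both steps could be spread across many workers without changing the result.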

    Slackware post install

    Like my previous article, which was written in Polish, this text describes some steps I take after installing Slackware Linux. I try to strike a balance between performance, security and usability, but not everything written here may work for everyone. You have been warned.

    A.I.

    While cleaning up the Tiny Applications Collection, I’ve dropped both artificial intelligence scripts. Not wanting to let them disappear completely, I’m posting them here for posterity. The first one is an eight-line version that might be what Sid wrote as his first ever program:

    #!/usr/bin/perl -wWtT
    while (<>) {
    	if (/[aeiouyAEIOUY][^a-zA-Z]*$/) {
    		print "Yes.\n";
    	} elsif (!/^\s*$/) {
    		print "No.\n";
    	}
    }
    USER FRIENDLY by Illiad
    Copyright © 1999 by J.D. “Illiad” Frazer
    I remember the first program I ever wrote was an “A.I.” that could play 20 questions. You could ask it any question in an attempt to guess what thing it was thinking about and it would answer yes or no. And I did it with only eight lines of code.
    Did you say eight lines of code?
    Yep! My friends never did guess the right answer. They asked “Is it blue?” and it answered “Yes.” Then, “Is it bigger than a breadbox?” It answered “No.” “Is it smaller than a breadbox?” It answered “No.” “Does it go wiki-wiki-wiki?” It answered “Yes.”
    Uh… It was thinking about a blue breadbox that goes wiki-wiki-wiki?
    The trick was in the code. It answered “Yes” to any question that ended in a vowel.
    User Friendly comic strip for 2000-10-14 in which Sid describes his eight-line AI program.

    Standard-agnostic HTML code

    HTML has come quite a long way since its inception. This means a lot of new features but also some small incompatibilities which may pose issues in certain situations. For instance, when posting a code snippet for others to include on their websites, it’s best if it works correctly on as many sites as possible, which implies being compatible with as many versions of HTML as possible. But how do you create a snippet that works in both HTML and XHTML? Here are a few tips:

    CSS sprites as background

    CSS sprites aren’t anything new. They have been around for years, and are one of the methods of optimising a website’s load time. The idea is to combine multiple images into one and in this way decrease the number of round trips between the server and the browser.

    In their traditional use, CSS sprites work as a replacement for images and cannot be used as a background. Alas, that is exactly what I wanted to do with a quote and flag icons like the following:

    Example block quote with a quote icon and two paragraphs with flags

    Update: This website has evolved slightly since 2013. The flags are no longer used (replaced by content negotiation) and the quote sprite icon has been replaced by an SVG. While I no longer use this technique, it is of course still valid.

    After some playing around I’ve finally figured out how to get this working. Even though there are some caveats, sprites can be used as a top-left no-repeat background image as well.

    The fifth generation

    Southwark Cathedral with The Shard skyscraper in the background
    (photo by Tristan Surtel)

    This day had to come sooner or later. Even more so since I love squeezing every byte out of the data being sent over the network, which is why the source of this website is so unreadable (don’t worry though, readable sources are available in a git repository).

    So yeah. I’ve switched this website to HTML5, using some of its new elements and leaving out optional tags. After years of using XHTML 1.1 it feels a bit weird not closing tags, but I guess a few saved bytes are worth it, aren’t they? ;)

    I’ve even got my electric slash working in Emacs’s html-mode (i.e. if I press slash after a < sign, the innermost element is closed automatically).

    Unfortunately, not all is so shiny. For some reason, automatic pagination on the entries list page and the ‘load content’ link stopped working under Opera. Both work by making an XMLHttpRequest and injecting a portion of the fetched document in the appropriate place. Opera, however, ends up with a DOMException: INVALID_STATE_ERR.

    SSL and dropping www. prefix using mod_rewrite

    Surprisingly, I couldn’t find any HTTPS-aware examples of how to drop the www. prefix from web hosts in Apache, so I had to come up with one myself. Firstly, the following lines need to find their way to the end of the Apache configuration file (/etc/httpd/conf/httpd.conf or something):

    RewriteEngine on
    RewriteCond %{HTTPS} off
    RewriteCond %{HTTP_HOST} ^www\.(.*)$
    RewriteRule ^(.*)$ http://%1$1 [L,R=301]

    Secondly, analogous lines need to be added inside the <VirtualHost _default_:443> section of the mod_ssl configuration file (/etc/httpd/conf.d/ssl.conf or similar), like so:

    <VirtualHost _default_:443>
    	# … various directives …
    
    	# Here’s what needs to be added:
    	RewriteEngine on
    	RewriteCond %{HTTP_HOST} ^www\.(.*)$
    	RewriteRule ^(.*)$ https://%1$1 [L,R=301]
    </VirtualHost>

    Now, after a restart, Apache will drop the www. prefix for both secure and insecure connections.
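
    To make the matching logic of the two rule sets easier to follow, here is a rough Python sketch of what they compute. The helper redirect_target and the example.com host names are hypothetical and exist only for illustration; the sketch models just the host condition and the substitution, not Apache itself or its flags.

```python
import re


def redirect_target(scheme, host, path):
    """Return the URL the rules would redirect to, or None when no rule applies.

    Hypothetical helper mirroring the conditions above; it is not part of Apache.
    """
    # Corresponds to: RewriteCond %{HTTP_HOST} ^www\.(.*)$
    match = re.match(r"^www\.(.*)$", host)
    if match is None:
        return None
    # %1 is the host without the www. prefix, $1 is the requested path.
    # The scheme is kept: http for the first rule set, https for the second.
    return f"{scheme}://{match.group(1)}{path}"


print(redirect_target("http", "www.example.com", "/foo"))
print(redirect_target("https", "www.example.com", "/foo"))
print(redirect_target("https", "example.com", "/foo"))
```

    The first two calls yield a redirect to the bare host name with the scheme preserved, while the last one returns None because the host has no www. prefix to drop.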

    CMA on LCE/ELCE 2012

    LinuxCon / Embedded Linux Conference Europe 2012 is nearly over, and I had the pleasure of talking about the Contiguous Memory Allocator. Slides from the talk are available for download and their source code can be accessed on GitHub.

    Unfortunately, in contrast to other LCE/ELCE conferences, talks were not recorded, so the video of the presentation is not available.

    For more links regarding CMA, I have set up a resource page at mina86.com/cma/. Besides the links to the final CMA patchset and to the LCE/ELCE presentation, it links to various articles and patches relating to CMA directly or indirectly.

    Lazy Proxy in Python

    The paths of destiny lead in mysterious ways. Not so long ago, I was a hard-core C hacker, and now I spend a lot of time coding in Python.

    In somewhat related news, I have discovered that my search-fu is not good enough: I was unable to find decent implementations of several design patterns in Python.

    What I needed was a generic proxy that would defer initialisation of an object to the moment it is first used. Here is what I came up with:

    class LazyProxy(object):
        __slots__ = '__get'
    
        def __init__(self, cls, *args, **kw):
            object.__setattr__(self, '_LazyProxy__get',
                               lambda: self.__set(cls(*args, **kw)))
    
        def __set(self, obj):
            object.__setattr__(self, '_LazyProxy__get', lambda: obj)
            return obj
    
        def __getattr__(self, name):
            return getattr(self.__get(), name)
    
        def __setattr__(self, name, value):
            return setattr(self.__get(), name, value)
    
        def __delattr__(self, name):
            return delattr(self.__get(), name)
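
    The following sketch shows how the proxy might be used. The class is repeated so the snippet runs on its own; Expensive and its instances counter are hypothetical names introduced purely for the demonstration.

```python
class LazyProxy(object):
    # The slot name is mangled to _LazyProxy__get, matching the names used below.
    __slots__ = '__get'

    def __init__(self, cls, *args, **kw):
        object.__setattr__(self, '_LazyProxy__get',
                           lambda: self.__set(cls(*args, **kw)))

    def __set(self, obj):
        # Once the object exists, replace the getter with one that just returns it.
        object.__setattr__(self, '_LazyProxy__get', lambda: obj)
        return obj

    def __getattr__(self, name):
        return getattr(self.__get(), name)

    def __setattr__(self, name, value):
        return setattr(self.__get(), name, value)

    def __delattr__(self, name):
        return delattr(self.__get(), name)


class Expensive(object):
    """Hypothetical class standing in for an object that is costly to create."""
    instances = 0

    def __init__(self, name):
        Expensive.instances += 1
        self.name = name


proxy = LazyProxy(Expensive, 'demo')
print(Expensive.instances)   # nothing has been constructed yet
print(proxy.name)            # first attribute access triggers construction
proxy.name = 'renamed'       # the assignment is forwarded to the real object
print(proxy.name, Expensive.instances)
```

    One thing to keep in mind: Python looks up implicit special methods (for example those behind len() or iteration) on the type rather than through __getattr__, so a proxy like this forwards ordinary attribute access only.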

    Deep Dive into Contiguous Memory Allocator

    This is an extended version of an LWN article on CMA. It contains more detail on how to use CMA and a lot of boring code samples.

    Contiguous Memory Allocator (or CMA) has been developed to allow large physically contiguous memory allocations. By initialising early at boot time and with some fairly intrusive changes to Linux memory management, it is able to allocate large memory chunks without a need to grab memory for exclusive use.

    Simple in principle, it grew into quite a complicated system which requires cooperation between the boot-time allocator, the buddy system, the DMA subsystem, and some architecture-specific code. Still, all that complexity is usually hidden away and normal users won’t be exposed to it. Depending on your perspective, CMA appears slightly different and there are different things to do and look for.