Temporenc, comprehensive binary encoding format for dates and times

Today I published a new side project of mine: temporenc.

Temporenc is a comprehensive binary encoding format for dates and times that is flexible, compact, and machine-friendly. I’ve put up a website for the project at temporenc.org.

Please give it a look and consider using it in your libraries and applications! If you like, you can discuss it on GitHub or Hacker News.

Plyvel, Pythonic bindings for LevelDB

I’m happy to introduce a new open source project of mine, Plyvel. Plyvel is a fast and feature-rich Python interface to LevelDB, a fast key-value storage library written at Google.

The Plyvel documentation contains a lot of information on its usage and installation. Visit Plyvel on GitHub for the sources and the issue tracker.

Introducing HappyBase

I’m happy to announce HappyBase, a developer-friendly Python library to interact with Apache HBase. HappyBase is designed for use in standard HBase setups, and offers application developers a Pythonic API to interact with HBase.

More information is available from the HappyBase documentation, including an installation guide, tutorial, and API docs. The HappyBase sources are on GitHub. HappyBase is also listed on PyPI, which makes it easy to find with tools like pip.

Gnome 3 has been released!

Today, Gnome 3 has been released. The Gnome 3.0 release notes contain lots of information, and a Dutch translation of the Gnome 3.0 release notes (translated by yours truly) is also available.

I am Gnome

Check out the Gnome 3 website and the regular Gnome website for more information about this exciting new release!

Gnome loves passwords

By default, password entries in GTK+ applications show a black circle when you type a character. Boring! With just a few days to go before Guadec starts, this is the time to show that you love Gnome. And you know what? You can make Gnome love you too in return!

Put this snippet in the file ~/.gtkrc-2.0 (create it if it does not exist):

style "entry" {
  GtkEntry::invisible_char = 0x2665
}
class "GtkEntry" style "entry"

From now on, all password entries will show a black heart (♥♥♥) for each character you type instead of a black circle. Yes, Gnome loves you!

Other characters are of course also possible. Use the Gnome Character Map application to find a Unicode symbol to your liking, and put the hexadecimal number for that character in the gtkrc snippet shown above.
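
If you would rather look up the number from a terminal than from the Character Map, a small Python sketch can do it. The helper name invisible_char_value is made up for this illustration; only ord() and unicodedata.name() come from the Python standard library.

```python
import unicodedata


def invisible_char_value(symbol: str) -> str:
    """Return the hexadecimal code point of one character, formatted for gtkrc."""
    if len(symbol) != 1:
        raise ValueError("expected exactly one character")
    return f"0x{ord(symbol):04x}"


if __name__ == "__main__":
    heart = "\u2665"  # the black heart used in the gtkrc snippet
    print(invisible_char_value(heart), unicodedata.name(heart))
    # prints: 0x2665 BLACK HEART SUIT
```

Paste the printed value after "GtkEntry::invisible_char =" in the gtkrc file.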

See you all at Guadec next week!

Guadec 2010 Registration Opens for Participants

The Guadec 2010 participant registration has officially opened! Read the press release on the Guadec site.

Guadec logo

Preparing the Guadec 2010 schedule

Today, a team of local Gnome and Guadec people gathered at Revelation Space to discuss several topics, including the talk schedule.

Guadec schedule preparations

We used card sorting techniques to group together related talks, and used sticky notes to spread the various topics over the three core days. The outcome is a preliminary schedule, but it needs some checking up and additional care before it can be published. Stay tuned!

Guadec schedule preparations

And yes, this is a very serious matter yet a very informal meeting, so the suits were totally appropriate.

New coordinator for the Dutch Gnome translation team

The (now former) coordinator of the Dutch translation team, Vincent van Adrighem, has been active for more than 9 years now and, despite the small size of the Gnome-NL team, has done a marvellous job getting the Dutch translations into good shape. However, after so many years, in all his wisdom, he has decided to step aside… which means:

Le coordinateur est mort — vive le coordinateur!

Taking effect immediately, yours truly has taken over the coordination role for the Dutch translation team. This means Vincent is a mere mortal again after being demoted to the ‘translator’ role. My proposal to take over the coordinator role and Vincent’s positive reply (both in Dutch) can be found in the Gnome-NL mailing list archives.

This change is made formal by my announcement and Vincent’s acknowledgement (both in English) on the general Gnome internationalisation (gnome-i18n) mailing list.

Many thanks to Vincent for his great work over the years!

Ssscrape released as open-source software

I’m pleased to see that the people at the ILPS group of the University of Amsterdam have released a project I worked on in the past under an open source license (LGPL).

Ssscrape is a system for collecting and processing dynamic web data. Ssscrape stands for Syndicated and Semi-Structured Content Retrieval and Processing Environment, and provides a framework for crawling and processing dynamic web data such as RSS/Atom feeds. Ssscrape is mostly implemented in Python and MySQL, but it should be noted that processing tasks can be implemented in any programming language, since Ssscrape simply invokes external executables. More information about Ssscrape can be found at the Ssscrape website.

I am no longer involved in the project, and from a quick glance I see that many things have changed since I last touched the code about one and a half years ago… really nice to see that the project is still alive.

Oh, and yes, I actually invented the bizarre acronym. I still think it’s a really cool and appropriate name.

Calculating the contents of fixed size pagination controls

When a web application needs to display many items, e.g. search results or large lists of records, it is often desirable to chunk the total list of items into equal-sized pages for easy navigation. This process is called pagination. Alternative techniques like continuous scrolling might also be worth considering, but this blog article is just about pagination.
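
To make the chunking concrete, here is a minimal sketch of the arithmetic, assuming pages are numbered from 1. The function names page_count and page_items are hypothetical and not taken from any particular framework.

```python
def page_count(total_items: int, page_size: int) -> int:
    """Number of pages needed to show total_items in chunks of page_size."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    # Ceiling division without floating point arithmetic.
    return -(-total_items // page_size)


def page_items(items: list, page: int, page_size: int) -> list:
    """Return the items that belong on the given page (pages start at 1)."""
    start = (page - 1) * page_size
    return items[start:start + page_size]


if __name__ == "__main__":
    results = list(range(1, 148))        # 147 search results
    print(page_count(len(results), 10))  # 15 pages, the last one partly filled
    print(page_items(results, 15, 10))   # the 7 results left for page 15
```

Only the last page can be shorter than the others; every other page holds exactly page_size items.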

If multiple pages of results are available, navigation links should be displayed on the output pages so that users can browse to other result pages. The list of links is what I call a pagination control. A pagination control could look something like this, where each item is a link to the corresponding page.

previous 1 … 5 6 [7] 8 9 … 15 next

In my examples the active page is shown in square brackets. I also set the display width to 9 (see below). For all examples the total number of pages is assumed to be 15, unless stated otherwise.

Controls like these are quite intuitive to use, and many websites (e.g. search engines) use pagination controls similar to this one, with subtle differences in their implementations. For example, some have ‘first’ and ‘last’ links, some don’t. There are many other choices to make.

Implementing pagination controls like the above seems trivial at first sight, but there are a few corner cases to consider, and it takes some thinking to get all cases right.

Display width

In my implementation, I assume a fixed number of links, so that the resulting output always has more or less the same size. I find this very useful, since the control looks roughly the same on all pages. I use the term display width to denote this value. In the example above the display width is set to 9. The gaps (shown with an ellipsis) are also counted, since those take roughly the same space as the links to the pages. (Optional ‘previous’ and ‘next’ links are not counted.)

A small exception to the fixed display width is that if there are fewer pages than the display width of the control, the complete list of pages is shown. For example, if there are only 8 pages in total, it looks like this:

previous 1 2 3 4 5 6 [7] 8 next

Note that the display width should be set to an odd number to ensure a nicely balanced output. (For even display widths, the algorithm favours showing one extra link after the active page, since a user making their way through many pages is much more likely to navigate in forward order.)
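
The odd versus even remark can be illustrated with a little arithmetic. Assume that the first page, the last page, the active page and two gaps always occupy five positions (an assumption read off the example control in this article, not taken from pagination.py). The remaining positions are context links that have to be split over both sides of the active page. The hypothetical helper context_split gives the extra position to the side after the active page.

```python
def context_split(display_width: int) -> tuple:
    """Split the context links into (before, after) the active page.

    Assumes five positions are taken by the first page, the last page,
    the active page and two gaps.
    """
    context = display_width - 5
    before = context // 2
    after = context - before  # receives the extra link for even widths
    return before, after


if __name__ == "__main__":
    print(context_split(9))  # (2, 2): balanced
    print(context_split(8))  # (1, 2): one extra link after the active page
```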

Which links to show?

The control should always show the active, first and last pages, which make for three items in the list. In the remaining space, the control should show as much context around the current page as space (defined by the display width) permits.

Gaps within the range of page numbers should be easy to spot to make it clear there are more pages available than the visible links. Gaps should be avoided if possible, so when the active page is close to the first or the last page, the control should try to align the numbers so that only one side of the control has a gap. The example below should clarify this:

[1]  2   3    4   5     6    7   …   15
 1  [2]  3    4   5     6    7   …   15
 1   2  [3]   4   5     6    7   …   15
 1   2   3   [4]  5     6    7   …   15
 1   2   3    4  [5]    6    7   …   15
 1  …   4    5  [6]    7    8   …   15
 1  …   5    6  [7]    8    9   …   15
 1  …   6    7  [8]    9   10   …   15
 1  …   7    8  [9]   10   11   …   15
 1  …   8    9  [10]  11   12   …   15
 1  …   9   10  [11]  12   13   14   15
 1  …   9   10   11  [12]  13   14   15
 1  …   9   10   11   12  [13]  14   15
 1  …   9   10   11   12   13  [14]  15
 1  …   9   10   11   12   13   14  [15]

So, given these requirements, how to decide which links to show in the pagination control? The problem at hand is defined by three variables: the display width, the total number of pages, and of course the active page.

I wrote an algorithm that (as far as I can see) satisfies all constraints expressed above for all display widths of at least 7. A display width of fewer than 7 items does not make any sense; the reason why is left as an exercise to the reader. (Hint: pagination controls are designed for navigating to other pages.) A quite clean Python implementation, which I hereby put in the public domain, can be obtained here:

Download pagination.py

Just run the script to see some example output. Porting this code to other languages should be trivial. Rendering nice XHTML out of the resulting list of numbers is very application-specific and hence left as an exercise to the reader.
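
For readers who only want the gist, the following is a minimal sketch of one way to satisfy the requirements above. It is not the code from pagination.py: the names pagination_items, render and GAP are made up for this illustration, and the thresholds are my reading of the example table (a gap is only used when it hides at least two pages).

```python
GAP = None  # marks a gap, to be rendered as an ellipsis


def pagination_items(active, total, width=9):
    """Return the page numbers to show, with GAP where pages are skipped."""
    if width < 7:
        raise ValueError("display width must be at least 7")
    if not 1 <= active <= total:
        raise ValueError("active page must be between 1 and total")

    # No more pages than the display width: show the complete list.
    if total <= width:
        return list(range(1, total + 1))

    # First page, last page, active page and two gaps take five positions;
    # the rest is context, with the extra link (even widths) placed after.
    before = (width - 5) // 2
    after = (width - 5) - before
    start, end = active - before, active + after

    if start <= 3:
        # Close to the first page: align left, one gap before the last page.
        return list(range(1, width - 1)) + [GAP, total]
    if end >= total - 2:
        # Close to the last page: align right, one gap after the first page.
        return [1, GAP] + list(range(total - width + 3, total + 1))
    # Somewhere in the middle: a gap on both sides of the context window.
    return [1, GAP] + list(range(start, end + 1)) + [GAP, total]


def render(items, active):
    """Format the items as text, with the active page in square brackets."""
    labels = []
    for item in items:
        if item is GAP:
            labels.append("…")
        elif item == active:
            labels.append(f"[{item}]")
        else:
            labels.append(str(item))
    return " ".join(labels)


if __name__ == "__main__":
    # One line per active page, for 15 pages and a display width of 9.
    for page in range(1, 16):
        print(render(pagination_items(page, 15), page))
```

Returning a plain list with a gap marker keeps the calculation separate from the markup, so the same list can be rendered as text, as XHTML, or as anything else.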

With this approach, showing back and forward buttons only if appropriate is trivial. If the current page is larger than 1, a ‘previous’ link should be included. Similarly, if the current page is smaller than the number of pages, a ‘next’ link should be shown. ‘First’ and ‘last’ links should not be rendered, since page 1 and the last page are always included in the output and extra links would not offer the user anything that the other links already offer.
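
The two conditions translate almost literally into code. This is a small sketch with a hypothetical helper name, previous_next, which returns the target page for each link, or None when the link should be left out.

```python
def previous_next(active: int, total: int) -> tuple:
    """Return (previous_page, next_page); None means: do not show that link."""
    previous_page = active - 1 if active > 1 else None
    next_page = active + 1 if active < total else None
    return previous_page, next_page


if __name__ == "__main__":
    print(previous_next(1, 15))   # (None, 2): no 'previous' link on the first page
    print(previous_next(7, 15))   # (6, 8)
    print(previous_next(15, 15))  # (14, None): no 'next' link on the last page
```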
