Ivan Čukić

Tests for code that uses a database

Most sane programmers hate writing tests. Because of that, we have a lot of testing frameworks around that make the task more streamlined, but it is still boring work.

A lot of sane programmers hate writing anything that needs (dynamic) SQL. For that, a lot of DB frameworks came into existence. One of the fancier ones for C++ has to be sqlpp11, which provides a type-safe way of writing SQL using regular C++.

But, alas, this is not a post about sqlpp11, since it can not be used in KDE Frameworks because of its heavy use of C++11 features.

What is this all about then?

This post is more about how to write tests for code that uses Qt’s way of working with databases – the JDBC-inspired QSql framework.

I’m currently working on a new part of the KActivities framework which will allow applications to retrieve usage statistics in order to provide you with ‘most used documents’ and improved ‘recent documents’ lists, and a few other things as well.

The library needs to provide a nice querying mechanism, different sorting and filtering options, but without exposing the actual database schema and giving the clients all the “destructive” :) power of SQL.

Testing

One of the approaches is to provide a predefined “dummy” database to the tests, and write the tests that check for the corner cases that you can predict, along with a few expected ones.

This works well, but might not be sufficient. It would be nice if we could generate a larger corpus of data to test against.

Generating the data is not that difficult – just a few for-loops and a lot of rand() calls. Just make sure you are generating valid data.
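The generator can stay very small. What follows is only a minimal sketch under a few assumptions: the record has the same shape as the Employee example used later in this post, std::string stands in for QString so that it builds without Qt, and the names pickOne and generateEmployees as well as the value pools are made up for illustration. It also uses std::mt19937 with a fixed seed in place of rand(), so that a failing run can be reproduced.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Stand-in record; std::string replaces QString (assumption for this sketch)
struct Employee {
    std::string companyId;
    std::string jobId;
    std::string name;
    std::string surname;
};

// Picks one element of a non-empty pool uniformly at random
inline const std::string &pickOne(const std::vector<std::string> &pool,
                                  std::mt19937 &rng)
{
    std::uniform_int_distribution<std::size_t> index(0, pool.size() - 1);
    return pool[index(rng)];
}

// Generates `count` records. Every field is drawn from a fixed pool of
// valid values, so the data is valid by construction. The same seed
// always produces the same records.
inline std::vector<Employee> generateEmployees(std::size_t count, unsigned seed)
{
    // Hypothetical value pools, only for illustration
    static const std::vector<std::string> companies{"c1", "c2", "c3"};
    static const std::vector<std::string> jobs{"dev", "qa", "ops", "pm"};
    static const std::vector<std::string> names{"Ana", "Ivan", "Mila"};
    static const std::vector<std::string> surnames{"Petrov", "Jones", "Kim"};

    std::mt19937 rng(seed);
    std::vector<Employee> result;
    result.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        result.push_back({pickOne(companies, rng), pickOne(jobs, rng),
                          pickOne(names, rng), pickOne(surnames, rng)});
    }
    return result;
}
```

The generated records can then be inserted both into the real database and into the in-memory collections.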

In-memory database

Now comes the tricky part, and that is reimplementing the logic that uses the database with standard collections like std::vector (or QVector) and others.

When that is done, it is trivial to run every possible query combination that you have against the database and the in-memory collections and compare the results.

Schema

As with the regular databases, you first need to define the schema you want to work on. We can define the records as simple structs (do not judge the usefulness of the example :) ):

struct Employee {
    QString companyId;
    QString jobId;
    QString name;
    QString surname;
};

Now that we have the record, we should probably define a primary key for it. An easy way would be to add a method called PRIMARY_KEY to all the records. It can either return a single column, or a tuple of more than one of those. Something like this:

struct Employee {
    // :::

    std::tuple<const QString &, const QString &>
    PRIMARY_KEY() const {
        return std::tie(companyId, jobId);
    }
};

It is now trivial to create in-memory tables that obey the uniqueness of these keys.

// Single-time boilerplate
struct RestrictOnPrimaryKey {
    template <typename T>
    bool operator() (const T &left, const T &right) const {
        // tuples have natural order
        return left.PRIMARY_KEY() < right.PRIMARY_KEY();
    }
};

template <typename T>
using Table = std::set<T, RestrictOnPrimaryKey>;

// Nice code
Table<Employee> employees; 
Table<Companies> companies;
Table<OtherStuff> otherStuff;

std::set (or the equivalents from boost and Qt) will ensure that you can not add more items with the same key.
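To see the uniqueness rule in action, here is a small self-contained sketch. It assumes std::string in place of QString so that it compiles without Qt, and insertRecord is a hypothetical helper name, not something from a library. It relies on the fact that std::set::insert returns a pair whose second member tells whether the element was actually inserted.

```cpp
#include <set>
#include <string>
#include <tuple>

// Same record as above, with std::string standing in for QString
struct Employee {
    std::string companyId;
    std::string jobId;
    std::string name;
    std::string surname;

    std::tuple<const std::string &, const std::string &>
    PRIMARY_KEY() const {
        return std::tie(companyId, jobId);
    }
};

struct RestrictOnPrimaryKey {
    template <typename T>
    bool operator() (const T &left, const T &right) const {
        // Two records are equivalent when neither key is less than the other
        return left.PRIMARY_KEY() < right.PRIMARY_KEY();
    }
};

template <typename T>
using Table = std::set<T, RestrictOnPrimaryKey>;

// Hypothetical helper: returns true if the record was inserted, false if
// a record with the same primary key was already in the table
template <typename T>
bool insertRecord(Table<T> &table, const T &record)
{
    return table.insert(record).second;
}
```

Inserting a second employee with the same companyId and jobId returns false and leaves the table unchanged, much like a primary key violation would in the database (minus the error).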

So, this is the schema. The basics are here, you could even play around and try to create foreign keys, and multiple unique keys on a single table, but that is out of the scope of this post.

Queries

Now, the only thing left is to mimic some SQL queries. We will not do anything fancy, but some basic SQL concepts are easily transferable to the C++ world using the boost.range library.

WHERE clause

Filtering these collections on a specific predicate is really easy. You can just use the boost::adaptors::filtered adaptor:

    employees | filtered([] (const Employee& e) { ... })

The predicate can be any function, a class with operator(), or a lambda.
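If boost.range is not available in the test environment, the same idea can be sketched with the standard library alone. This is not the boost adaptor: it is an eager stand-in that copies the matching records instead of producing a lazy view. The function name where and the use of std::string instead of QString are assumptions made for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Stand-in record; std::string replaces QString (assumption for this sketch)
struct Employee {
    std::string companyId;
    std::string jobId;
    std::string name;
    std::string surname;
};

// Eager counterpart of `collection | filtered(predicate)`: copies every
// record for which the predicate returns true into a new vector
template <typename Collection, typename Predicate>
std::vector<typename Collection::value_type>
where(const Collection &collection, Predicate predicate)
{
    std::vector<typename Collection::value_type> result;
    std::copy_if(collection.begin(), collection.end(),
                 std::back_inserter(result), predicate);
    return result;
}
```

A call like where(employees, [] (const Employee &e) { return e.companyId == "kde"; }) then plays the role of WHERE companyId = 'kde'.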

ORDER BY clause

ORDER BY is simply called std::sort. Even better, it is called boost::sort.

    sort(employees, comparisonFunction)

Again, the argument can be any function, a class with operator(), or a lambda. Since filtering and sorting on a specific column of a table is a common use-case, it would be overkill to write lambdas for all of them all the time. We can easily do something like this (it is easily adapted for filtering):

// Single-time boilerplate
#define DECL_COMPARATOR(Type, MemberType, MemberName) \
    Comparator<Type, MemberType> \
    MemberName##Comparator(&Type::MemberName)

template <typename Type, typename MemberType>
struct Comparator {
    Comparator(MemberType Type::* memberptr)
        : memberptr(memberptr)
    {
    }

    // Orders two records by the selected member
    bool operator() (const Type &left, const Type &right) const
    {
        return left.*memberptr < right.*memberptr;
    }

    MemberType Type::* memberptr;
};

// Nice code
DECL_COMPARATOR(Employee, QString, name);
DECL_COMPARATOR(Employee, QString, surname);

// Sort by name
sort(employees, nameComparator)

// Sort by surname
sort(employees, surnameComparator)
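One detail worth keeping in mind: a std::set always keeps its elements in the order of its own comparator, and sorting needs a random-access collection, so the rows have to be copied into something like a std::vector before they can be reordered. The following is a minimal standard-library sketch of that step. The names compareBy and orderBy are hypothetical, std::string stands in for QString, and the comparator is a variant that takes the record type as a template parameter.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Stand-in record; std::string replaces QString (assumption for this sketch)
struct Employee {
    std::string companyId;
    std::string jobId;
    std::string name;
    std::string surname;
};

// Compares two records by one member, selected via pointer-to-member
template <typename Type, typename MemberType>
struct Comparator {
    explicit Comparator(MemberType Type::* memberptr)
        : memberptr(memberptr)
    {
    }

    bool operator() (const Type &left, const Type &right) const
    {
        return left.*memberptr < right.*memberptr;
    }

    MemberType Type::* memberptr;
};

// Hypothetical helper that deduces the template arguments
template <typename Type, typename MemberType>
Comparator<Type, MemberType> compareBy(MemberType Type::* memberptr)
{
    return Comparator<Type, MemberType>(memberptr);
}

// ORDER BY: copies the rows into a vector and sorts the copy, leaving
// the original collection (for example a std::set) untouched
template <typename Collection, typename Compare>
std::vector<typename Collection::value_type>
orderBy(const Collection &collection, Compare compare)
{
    std::vector<typename Collection::value_type> rows(collection.begin(),
                                                      collection.end());
    std::stable_sort(rows.begin(), rows.end(), compare);
    return rows;
}
```

With this, orderBy(employees, compareBy(&Employee::surname)) gives the rows sorted by surname, whatever container they were stored in.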

SELECT clause

Selecting is, again, simple. Just use boost::adaptors::transformed.
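For completeness, here is a rough standard-library equivalent of a projection, for the case when boost.range is not an option. It is an eager sketch built on std::transform, not the lazy boost adaptor. The function name selectColumn is made up for this example, and std::string again stands in for QString.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <type_traits>
#include <vector>

// Stand-in record; std::string replaces QString (assumption for this sketch)
struct Employee {
    std::string companyId;
    std::string jobId;
    std::string name;
    std::string surname;
};

// SELECT: applies the projection to every record and collects the results.
// The element type of the result is whatever the projection returns.
template <typename Collection, typename Projection>
auto selectColumn(const Collection &collection, Projection projection)
    -> std::vector<typename std::decay<
           decltype(projection(*collection.begin()))>::type>
{
    std::vector<typename std::decay<
        decltype(projection(*collection.begin()))>::type> result;
    std::transform(collection.begin(), collection.end(),
                   std::back_inserter(result), projection);
    return result;
}
```

Calling selectColumn(employees, [] (const Employee &e) { return e.name; }) corresponds to SELECT name FROM employees.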

Food for thought

It might be fun if one would use a library like sqlpp11 for the tests while implementing the main logic manually…

This is all for now, I’m tired of writing. Back to the code. :)

EDIT: An alternative to sqlpp11 which also makes SQL type safe is sqlate from our friends at KDAB. I wish all the best to both projects, can not wait for the day when I can stop writing raw SQL.

[book review] Application Development with Qt Creator, Second Edition

I got a copy of Application Development with Qt Creator, 2nd ed. for review, so I decided to post the review here – KDE is still the greatest Qt community in the world, and we have more than a few students and teachers in it who might benefit from a book like this one.

Before I start, I ought to say that lately I’m used to reading some more involved material on C++, Haskell and category theory. I’m just pointing this out in case I sound a bit more negative in this review than the book deserves.

Target audience

First of all, who is this book meant for? It seems to be aimed at CS students (or others who had no contact with Qt before) who want to learn Qt when they already know C++ to some degree. It generally requires only basic understanding of C++ – I’m even wondering whether a lonely Java or C# programmer would be able to start coding in Qt with this book, without any prior knowledge of C++ (if somebody tries to, please let me know how it fares).

Topics covered

A solid part of the book covers Qt Creator, from its debugging environment to the QWidget and QML UI designers. These parts are nicely written, and provide sufficient explanations for anybody to get started with it. There are a few places where graphical representations would make more sense than textual explanations (QWidget layouts), but I guess those are simple enough not to need the explanations at all.

Other parts cover topics from Qt’s collection classes, network connections and xml parsing, to Qt Quick with multimedia and sensors. The topics are not covered in-depth, but rather provide a brief overview. Each section includes explanations of most important classes, their APIs and how they are used.

Due to its brevity, I think the book fails to mention a few important bits like QMutexLocker while talking about mutexes, or to explain the difference between QList and std::list (like why the former is less evil than the latter :) ) while comparing them.

The chapter that I was positively surprised to see is the “Optimizing Performance with Qt Creator” which deals with QML performance analyser and Valgrind. It is usually not a topic covered in beginner courses, and it really ought to be.

Summary

Now, a short summary that does not really sum up anything. I can not make up my mind about this book.

The fact it focuses only on Qt, and does not cover it in-depth, would make it insufficient for my course (I teach C++ and Qt to the final year B.Sc. students).

I can see it as a viable (and even good) option for the early CS courses if the students have already had some basic programming course that covered C++ basics beforehand. For this stage of learning, the covered topics tend to be interesting (UI, multimedia), useful (IDE, profiling) and not over-demanding.

I realise I’m hammering on about students. It is simply because the author implied that they are the target audience, and due to my position at the university.

What about everyone else?

Since I can not make up my mind, and say “buy it!” or “run away!”, my advice is to go to the store and check it out for yourself. It is well written, and easy to read, but you should see whether the format suits you. It can be a nice introduction to Qt if you are not from around these parts. I’ve found chapter 7 to be the most representative of the rest of the book, so when you check that one out, you’ll know what you are getting.

A few more bits

Unsorted and uncovered parts from my notes:

  • I haven’t checked out the price-tag, didn’t want to influence my expectations;
  • I’m sometimes overly harsh when reviewing things – when I find dubious and incorrect statements (and expect the author to know better), I can go berserk – it did not happen in this case, not once;
  • I sometimes get annoyed by layout and printing errors – it happened only once or twice with this book :);
  • Thanks to Packt for giving me the book to review.

Meeting C++ and fantastic people

I got back from Meeting C++ and I must say I loved every second of it. At first, it was a bit strange – I’m accustomed to KDE/Qt conferences where I know a lot of people. Here, it was not the case. It is a bit sad to see that barely anyone from the Qt community was there (apart from a few KDAB people), but that is a separate topic.

The conference started with the great Scott (pun intended) Meyers. The talk was less technical than most of us expected, but it was really awesome. It was filled with great advice for anyone wanting to write books or give talks. It even made me change a few parts of my presentation which was scheduled for the next day.

On Scott’s first slide, he showed my favourite monument in Berlin – the Soviet War Memorial at Treptower Park. I followed, and raised him the one at Tiergarten.

Treptower, Tiergarten

It was a truly awesome feeling to speak in front of people like Scott Meyers, Hartmut Kaiser and Detlef Wilkening. And it was fantastic to see that people are really interested (quite surprising for me) in monads and asynchronous programming. I got a few questions in the Q&A section, and many more afterwards.

The next step for me is the C++ Russia meeting in Moscow. It seems I’ll have the chance to meet Bartosz Milewski and Sean Parent there. Can’t wait!

Me at MeetingC++

Meeting C++ with Monads, Berlin

I’ve been silent for the past month. More than a few obligations took me away from my regular KDE schedule.

A few of those obligations are culminating in my talk at Meeting C++ this Saturday in Berlin, which will be about Monads put into chains. It is something that started in KActivities during the time I was sponsored by basysKom to work on it.

Afterwards, I saw a couple of blog posts regarding monads in C++, mostly written by Bartosz. Strangely enough, the first time I saw someone writing about the continuation monad in C++ was a few months after I started to use them in my code.

Then, I realized it might be interesting to a wider audience, and I have continued to work on refining those ideas into something really fun and useful.

Some parts have been presented at QtDevDays2013, some will be presented now at Meeting C++, and (hopefully) some finishing touches will be added at the C++ Russia conference in Moscow this February.

p.s. I’m arriving in Berlin tomorrow, if anyone wants to go for a beer. :)

API Design Part 2: Impact on the safety

Continuing on the topics I talked about at this year’s aKademy conference.

Most of the things I want to write here are not new to people who are immersed in C++ and follow the books/presentations by Alexandrescu, Sutter, and others. But, I’ve found that in our Qt-sub-culture, it is often not the case.

Qt is very good at hiding the ugly parts of C++, but at the same time, it sometimes hides too much.

Memory safety

One of the first things you learn in Qt is about the object trees and ownership where the parent object conveniently destroys its children on its destruction.

IMO, this is one of the nicest features of Qt. But, sometimes, it tends to provide a false sense of security. As a trivial example, imagine the following:

QThread *thread = new QThreadDerivedClass(this);
thread->start();

It looks nice – we are creating a new object, we pass it a parent (this), so we do not need to worry about its destruction ourselves. The issue is that the parent is often a long-lived object like a QCoreApplication or a main window. This tends to end up in the thread object not being destroyed until the application has been terminated.

Now imagine that the above object is an image cache, and you’ll have a quite substantial memory leak.

The parent-child ownership is a silver bullet in a lot of cases, but not all.

What does a pointer mean?

So, let’s return to the topic at hand – API design. One of the biggest problems when using a new library is figuring out what the following declaration means:

SomeType * someMethod();

Namely, the question here is what the ‘SomeType *’ means. It can mean quite a few things:

  • (static) singleton instance
  • should be disposed by the user
  • creator-owned, creator disposes of it
  • an optional value? (for example, parsing a number from a string could return a null if parsing failed)
  • position in an array? (this one is rarely used nowadays, we have iterators for that case)

The problem here is deducing who owns the returned object, and whether we are guaranteed to get a non-null result at all. It can not be deduced without reading the documentation, and it could and should be.

1. Singletons

Since a singleton should always be present, there is really no point in making it a pointer at all. Implementing a proper singleton should be as easy as:

SomeType & instance() 
{
    static SomeType s_instance;
    return s_instance;
}

It is thread-safe (in C++11), its declaration clearly states that it returns a non-null object, and an object with a long lifetime.

2. Factories

Next are the factory functions that return an instance of the object whose owner should be the caller, and the caller is responsible for its destruction:

std::unique_ptr<SomeType> createObject(...);

Even if the caller forgets to save the returned value, nothing will be leaked.
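The no-leak claim is easy to check with a small experiment. The sketch below is only an illustration: SomeType is given a live-instance counter that exists purely for the demonstration, and createObject takes no arguments here, which is an assumption since the original signature is elided.

```cpp
#include <memory>

// Hypothetical type that counts how many of its instances are alive
struct SomeType {
    static int &liveInstances()
    {
        static int count = 0;
        return count;
    }

    SomeType() { ++liveInstances(); }
    ~SomeType() { --liveInstances(); }

    // Non-copyable, so the counter only tracks explicitly created objects
    SomeType(const SomeType &) = delete;
    SomeType &operator=(const SomeType &) = delete;
};

// Factory: ownership of the new object is handed over to whoever calls it
std::unique_ptr<SomeType> createObject()
{
    return std::unique_ptr<SomeType>(new SomeType());
}
```

Calling createObject(); as a statement and ignoring the result destroys the temporary unique_ptr at the end of that statement, so SomeType::liveInstances() is back to zero right afterwards. With a raw pointer return type, the same line would have leaked the object.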

3. Caches, ref-counted singletons, etc.

When we have a function that returns something that can be destroyed at any time, or that should be destroyed only after everyone stops using it, we can simply write

std::shared_ptr<SomeType> getObject(...);
// or
std::weak_ptr<SomeType> getObject(...);

The first one tells the caller that the object will exist for as long as they want to use it, but without the guarantee that it will be destroyed immediately after.

The latter says that the caller has been given an object that could go away at any point in time.
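The two smart pointers also work nicely together. Here is a minimal sketch of a cache that hands out shared_ptr to its users while keeping only weak_ptr internally, so an entry lives exactly as long as somebody is using it. The Image and ImageCache names are invented for this example, and the "loading" is faked by just storing the path.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical cached resource
struct Image {
    std::string path;
};

class ImageCache {
public:
    // Returns the cached object if somebody still holds it,
    // otherwise creates (here: pretends to load) a fresh one
    std::shared_ptr<Image> get(const std::string &path)
    {
        std::weak_ptr<Image> &slot = m_entries[path];
        if (std::shared_ptr<Image> alive = slot.lock()) {
            return alive;
        }
        std::shared_ptr<Image> fresh = std::make_shared<Image>(Image{path});
        slot = fresh;
        return fresh;
    }

    // True while at least one user still holds the object
    bool isAlive(const std::string &path) const
    {
        auto it = m_entries.find(path);
        return it != m_entries.end() && !it->second.expired();
    }

private:
    // The cache itself does not keep the objects alive
    std::map<std::string, std::weak_ptr<Image>> m_entries;
};
```

The important part is lock(): it either gives back a shared_ptr that keeps the object alive while it is being used, or an empty one if the object is already gone. There is no window in which a dangling pointer could be observed.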

4. Optional result values

The last use-case is the most problematic one. Not because it has not been solved, but rather because the necessary class has not yet been provided in C++.

When it becomes the part of the standard, it will look something like this:

std::optional<SomeType> parseTheType(...);

For the time being, you could use boost::optional, pair<bool, SomeType> and similar.
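For readers with a newer compiler: std::optional did become part of the standard with C++17. Assuming a C++17 toolchain, the number parsing example from the list above could look roughly like this. parseInt is a hypothetical name, and the sketch accepts only strings that are a decimal integer from start to end.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Parses the whole string as a decimal int.
// Returns std::nullopt if the text is empty, contains anything other
// than the number, or does not fit into an int.
inline std::optional<int> parseInt(std::string_view text)
{
    int value = 0;
    const char *first = text.data();
    const char *last = text.data() + text.size();

    const std::from_chars_result result = std::from_chars(first, last, value);
    if (result.ec != std::errc() || result.ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

The caller is forced to acknowledge the failure case, for example with if (auto number = parseInt(input)) { ... }, and there is no pointer whose ownership could be misunderstood.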

If you wanted to give the failure error as well, without resorting to throwing exceptions all over the place, my advice is to go and watch Alexandrescu’s talk on “Systematic error handling in C++”.

Exception safety

Now, after we saw what can be done to the API to make the user’s life easier when it comes to memory management, just a very short note about exception safety.

There is a reason why std::stack does not have a method pop that returns a value, but has the separated top and pop. This is, again, one of the things that should be known to most C++ developers, yet sometimes you can even find some C++ book authors that take jabs at the standard committee for making it that way, and not going for the more convenient API.

I suggest that everyone look at some writings about this – the issue itself gives a nice overview of things to watch out for when designing an API which should behave well in an exception-enabled environment.
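The commonly cited reasoning goes like this: a pop that returns by value has to remove the element and then copy it to the caller, and if that copy throws, the element is already gone. With separate top and pop, the copy happens while the element is still safely on the stack. The sketch below demonstrates this with a made-up element type, Fragile, whose copy operations can be told to throw, and a hypothetical helper popInto that copies first and removes second.

```cpp
#include <stack>
#include <stdexcept>

// Hypothetical element type whose copy operations can be made to fail
struct Fragile {
    int value;

    static bool &failOnCopy()
    {
        static bool flag = false;
        return flag;
    }

    explicit Fragile(int v) : value(v) {}

    Fragile(const Fragile &other) : value(other.value)
    {
        if (failOnCopy()) throw std::runtime_error("copy failed");
    }

    Fragile &operator=(const Fragile &other)
    {
        if (failOnCopy()) throw std::runtime_error("copy failed");
        value = other.value;
        return *this;
    }
};

// Copy first, remove second: if the copy throws, the stack is unchanged
template <typename T>
void popInto(std::stack<T> &stack, T &out)
{
    out = stack.top(); // may throw, but nothing has been removed yet
    stack.pop();       // only reached after the copy has succeeded
}
```

If the assignment throws, the caller gets the exception and the stack still contains every element, which would not be possible to guarantee with a single value-returning pop.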

Build profiles addon script for kdesrc-build

I’ve been keeping a set of scripts to keep parallel builds of a few projects, to be able to test whether everything behaves well on older compilers. It is also nice to compile things with clang while developing since it usually provides nicer error messages compared to gcc, even if it might be slower or generate slower code.

The main problem is that the setup was not really scalable. Adding new build profiles, or projects was not as easy as it ought to be.

Enter kdesrc-build-extra

kdesrc-build-extra is a simple tool that creates profile-based alternative builds to those created by kdesrc-build. It does not do it for all projects, but only for those that you choose.

It allows you to create a few profiles, and specify which projects you want built with each of them. So, for example, you can keep parallel builds of plasma-workspace with gcc 4.5 and the latest clang, while having a static checker like clang-analyze for plasma-framework.

The example configuration file comes with the program. The format is the same as the one used by kdesrc-buildrc, just with a few custom fields.

This is one of the profiles from the provided example configuration file:

# Build profile for building with the clang compiler
build-profile clang-build

  # Prefix to use for building this profile
  build-dir    /opt/kf5/build-clang/

  # [optional] Where to install the binaries from this profile
  # install-dir /opt/kf5/usr-clang/

  # C++ compiler executable
  cxx-compiler /usr/bin/clang++

  # C compiler executable
  c-compiler   /usr/bin/clang

  # Which options to remove from the kdesrc-build setup
  # when building this profile.
  # Parameters covered by the above setting values are
  # automatically removed.
  cmake-options-remove -DUSE_COMPILER_HIDDEN_VISIBILITY \
                       -DKDE4_BUILD_TESTS \
                       -DBUILD_TESTING

  # What should we add to the parameter list?
  cmake-options-add    -DUSE_COMPILER_HIDDEN_VISIBILITY=0

  # Which projects do you want to build using this profile
  projects kactivities plasma-framework

end build-profile

Installation

Now comes a slightly weird part. Since these kinds of scripts in KDE tend to cover all programming languages in existence – perl, ruby, python and similar – I decided to go over the edge and do this in Haskell.

Thanks to Haskell’s package manager, the installation is quite simple.

cabal update
cabal install kdesrc-build-extra
cp ~/.cabal/bin/kdesrc-build-extra /path/to/your/kf5/sources

(you just need to install cabal and ghc before doing that)

Usage

Copy the kdesrc-build-extrarc-example file into the KF5 sources directory, edit it to fit your setup, and that is it.

API Design Part 1: Impact on the Performance (Qt vs STL example)

First of all, this post is not meant to criticize Qt in any way, just to raise some thinking points for people who create libraries.

After my talk at aKademy 2014, I’ve decided to start a short series of blog posts about some considerations to be had when designing public API and overall practices to make your code safer and cleaner.

API Design

When we are thinking about the API of a library we are crafting, we usually tend to think only about how easy it will be for the user to use it. And I’d say that Qt is doing very well in that regard.

The problem is that the API design should not be only about that.

When crafting code, we usually balance between keeping it readable by others, making it execute as fast as it can, using as little memory as possible, etc.

API design should be the same. Although, in this case, you can give a greater priority to the readability, but not by completely sacrificing the other parts.

A small example

A nice example of this sacrifice can be seen when comparing QTextStream::readLine to std::getline.

    // API shortened to look prettier
    QString QTextStream::readLine();
    std::istream& getline(std::istream &input, 
                          std::string &line);

The Qt version looks much nicer – it reads a line, and returns it. It can not be any simpler.

The STL version, on the other hand, returns the (rest of the) stream after the line is read, and it returns the actual line through the out argument. Is there anyone who actually likes the out arguments?

The benefit of this is that you can do an if(std::getline(:::)) and get whether the stream is still valid, which is useful sometimes. But, still, it looks like a harder function to use than the one from QTextStream.

Things start to look a bit different once you start thinking about the most common use case for using a function that returns a line.

    // taken out of the QTextStream documentation
    QTextStream stream(:::);
    QString line;
    do {
        line = stream.readLine();
        // do something
    } while (!line.isNull());

The equivalent code using the function from the STL would be something like this:

    std::ifstream stream(:::);
    std::string line;
    while (std::getline(stream, line)) {
        // do something
    }

After seeing the examples, it is a little bit hard to mark QTextStream::readLine as the clear winner where ease of use is concerned.

Measuring the speed

But, the point of this post is not to talk about that.

The point is to show that the API design has influence on other aspects, and not only on readability. The nice API of Qt comes with a significant performance penalty.

I’ve just tested the previous code snippets on a text file that has somewhere around 56000 lines. Each line of the file was longer than 50 characters to exclude the possibility for the libraries to use the short-string optimizations.

The results were measured using the clock() function in the ctime header (start_clock = clock(); ... clock() - start_clock), and the tests were repeated quite a few times:

* Qt code took ~41000 // QString
* Qt code took ~30500 // edit: QByteArray 
* STL code took ~8700

The STL version is almost 5 times faster (edit: 3.5 times compared to QByteArray version).

And it is not because Qt’s implementation is slow or anything. It is mainly because of the API design.

Why is that?

I’ll write the answer below in the comments section to allow you to think about what could be the issue here.

edit: Added QIODevice/QByteArray to the above mix, as suggested by csslayer.

ottens.js

Introducing ottens.js, a script inspired by Kevin’s great talk at aKademy that does to your web page what should have been done a long time ago. Just call the function when your page has been loaded.

function ottensize() {
    var html = document.body.innerHTML;
    // A string pattern replaces only the first match, so use global regexes
    html = html.replace(/hacking/g, 'crafting');
    html = html.replace(/hacked/g, 'crafted');

    document.body.innerHTML = html;
}

… and they pop up on your desktop

If you like to keep your project-related files on your desktop for easy access, you might have kept links to them in different folders which you placed in a folder view.

Now, it is much easier, just link them to the activity they belong to, and set the folder view to display it.

Linking files to activities

News from the Society for Putting Things on Top of Other Things:

Another feature has returned. This time with fewer issues and much more speed.

Linking

(yes, activities do need a new icon :) )