Czech strxfrm and strcoll implementing four-pass collation
I wrote an implementation of the function strxfrm that converts Czech (ISO-8859-2) text to a sequence that can be compared using strcmp. The conversion is defined so that it matches the Czech standard (ČSN 97 6030), and its interpretation by Petr Olšák, as closely as possible. If you have a problem with the result, for example you are not happy that numbers are ordered only after the letters z and ž, read the standard first. The same algorithm is used in the czech character set in MySQL and in the module Cz::Sort.
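To illustrate how a strxfrm result is meant to be used, here is a minimal sketch that builds the converted key for each of two strings and compares the keys with strcmp. The helper name compare_by_sort_key is hypothetical and not part of csort.c. The sketch also assumes that strxfrm accepts a NULL buffer with length 0 to report the required key size, which standard C permits but which has not been checked against this implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compare two strings through their strxfrm keys.
   Returns <0, 0 or >0 like strcmp. As a simplification for this sketch,
   it returns 0 when memory allocation fails. */
int compare_by_sort_key(const char *a, const char *b)
{
    /* With a zero-length buffer, strxfrm only reports the key length. */
    size_t la = strxfrm(NULL, a, 0) + 1;
    size_t lb = strxfrm(NULL, b, 0) + 1;
    char *ka = malloc(la);
    char *kb = malloc(lb);
    int result = 0;

    if (ka != NULL && kb != NULL) {
        strxfrm(ka, a, la);      /* convert each string to its sort key */
        strxfrm(kb, b, lb);
        result = strcmp(ka, kb); /* keys are compared byte by byte */
    }
    free(ka);
    free(kb);
    return result;
}
```

When many strings are sorted, it usually pays to compute each key once and keep it, rather than converting inside every comparison as this sketch does.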
The file also contains a function strcoll that compares two strings directly, without prior conversion and in constant memory. The file is compiled using
cc -c -o csort.o csort.c
and then, using
ld -shared -o csort.so csort.o
we turn it into a shared library. We then use it, for example, by setting an environment variable so that the shared library is loaded with the program, which ensures that the Czech strxfrm and strcoll are used instead of the default ones.
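As an illustration of code that would benefit from the replaced strcoll, the following sketch sorts an array of strings with qsort, calling strcoll for each comparison. The names compare_strings and sort_strings are hypothetical and not part of csort.c.

```c
#include <stdlib.h>
#include <string.h>

/* qsort callback: each element is a pointer to a C string. */
static int compare_strings(const void *pa, const void *pb)
{
    const char *a = *(const char *const *)pa;
    const char *b = *(const char *const *)pb;
    /* Whichever strcoll the dynamic linker resolves is the one called. */
    return strcoll(a, b);
}

/* Sort an array of n strings in place using strcoll. */
void sort_strings(const char **items, size_t n)
{
    qsort(items, n, sizeof items[0], compare_strings);
}
```

A dynamically linked program that calls sort_strings needs no changes to pick up the Czech ordering. Assuming the environment variable meant above is LD_PRELOAD (an assumption, since the text does not name it), running such a program with LD_PRELOAD=./csort.so set would be expected to resolve strcoll to the version in csort.so.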