Over the years, the use of word processors for desktop publishing has increased. This makes correct hyphenation more important, since it is needed to produce tidy right margins in columns.
Our Dutch hyphenation patterns are quite old, however, and contain errors.
This is partly because Dutch spelling changed in 2005, but mostly because Dutch has properties that are not well supported by existing software.
The properties that make hyphenation hard are:
Dutch words can contain an apostrophe (') as a character, as in testauto's; the ' is supported neither by the pattern generator nor by the hyphenation code.
The same applies to the dash (-). In Dutch it is not even a correct hyphenation character: bevestigings=e-mail (where = marks the hyphenation point) is correct, but bevestigings-e=mail is not.
Even a period (.) can be a word character, though this is exceptional: griffie=nrs.
Dutch is a compounding language. Luckily, one of the latest extensions to hyphenation provides some support for this phenomenon. It needs an extra check, however, since it might otherwise hyphenate words at incorrect positions. This check could verify that the parts are valid compounding parts, or that a known word or part remains after splitting.
Dutch compounding can also insert a linking letter: belasting and test can combine as belasting=test as well as belastings=test. Since the appearance of the s is quite free (it depends on the meaning of the first part of the compound), correct pattern detection is hard. I believe we need hyphenation positions only at the start of end parts (=test).
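To make the proposed check concrete, here is a minimal Python sketch. It is not taken from any existing hyphenation tool: the tiny word list, the function names and the treatment of the linking s are all assumptions for illustration. It only judges candidate compound boundaries, and accepts a split when the end part is a known word and the first part is a known word, optionally followed by a linking s.

```python
# Hypothetical mini-lexicon; a real check would need a full list of
# valid compounding parts.
KNOWN_PARTS = {"belasting", "test", "auto", "griffie"}

# Assumption: only an optional linking "s" is considered between parts.
LINKING = ("", "s")


def is_valid_split(word, pos, lexicon=KNOWN_PARTS):
    """Return True if breaking `word` before index `pos` leaves a known
    first part (optionally followed by a linking 's') and a known end part."""
    head, tail = word[:pos], word[pos:]
    if tail not in lexicon:
        return False
    return any(
        head.endswith(link) and head[: len(head) - len(link)] in lexicon
        for link in LINKING
    )


def filter_points(word, candidates, lexicon=KNOWN_PARTS):
    """Keep only the candidate break positions that pass the check."""
    return [p for p in candidates if is_valid_split(word, p, lexicon)]


if __name__ == "__main__":
    # Candidate positions would come from the pattern-based hyphenator.
    print(filter_points("belastingstest", [9, 10]))
    print(filter_points("belastingtest", [5, 9]))
```

With this word list, the break in belastingstest is accepted only before test, which matches the idea of placing hyphenation positions at the start of end parts.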
Furthermore, Dutch words may change spelling when hyphenated (omaatje => oma=tje). Luckily, this is also supported nowadays. But substrings.pl, which generates the patterns for OOo, might ruin these patterns.
Though most of these phenomena of Dutch are supported by the hyphenation code, none are supported by the tools that generate and process patterns (Patgen, the OOo pattern generation process, hyphenation tools).
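As an illustration of what a spelling-changing break involves, the following Python sketch stores, for each such word, which span is replaced and what text ends up on either side of the break. The table layout and the names Break, SPELLING_CHANGING_BREAKS and break_word are hypothetical; this is not the pattern format used by the hyphenation library or by substrings.pl.

```python
from typing import NamedTuple, Optional, Tuple


class Break(NamedTuple):
    start: int   # index in the unhyphenated word where the replaced span begins
    length: int  # number of characters that are replaced
    before: str  # replacement text that ends the first line (hyphen not included)
    after: str   # replacement text that starts the next line


# Hypothetical exception table with the single example from the text:
# "omaatje" breaks as "oma-" / "tje", so one "a" disappears.
SPELLING_CHANGING_BREAKS = {
    "omaatje": Break(start=2, length=5, before="a", after="tje"),
}


def break_word(word: str, table=SPELLING_CHANGING_BREAKS) -> Optional[Tuple[str, str]]:
    """Return the two line fragments for a spelling-changing break,
    or None if the word has no such rule."""
    rule = table.get(word)
    if rule is None:
        return None
    prefix = word[: rule.start]
    suffix = word[rule.start + rule.length:]
    return prefix + rule.before + "-", rule.after + suffix


if __name__ == "__main__":
    print(break_word("omaatje"))  # ('oma-', 'tje')
```

The point of the sketch is that such a rule carries more than a break position, so any tool that rewrites or compresses patterns has to carry the replacement data along unchanged.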
So either we start a very elaborate process of manual pattern creation, or we have these tools improved.
Are there any other language maintainers with the same or similar problems who are willing to contribute (specifications, programming, or money) to improving the hyphenation tools?