hyphen.tex
which can be found here, or somewhere in your TeX installation.
The contents of that file might appear a bit cryptic.
See this question on stackexchange: →.
Because hyphenation patterns cannot cover every possible word break in a language, there will be exceptions: cases where TeX fails to hyphenate a word properly.
In a document I was working on recently ‘typography’ was broken as ‘typog-raphy’, which looked strange to me.
You can instruct TeX to break a word the way you want by putting hyphenation exceptions at the beginning of the document, like this:
\hyphenation{ty-po-gra-phy}
But I discovered that this wasn’t actually a hyphenation failure: TeX was simply using its default US English hyphenation rules. When I made TeX use British hyphenation patterns by setting \uselanguage{ukenglish}, I got ‘typo-graphy’ without having to use exceptions.
British English tends to break words etymologically, whereas American English breaks words syllabically. →
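To see where TeX is willing to break a word before deciding whether an exception is needed, plain TeX's \showhyphens macro can help. The following is only a minimal sketch: it assumes pdfTeX with a plain format in which \uselanguage is defined and the ukenglish patterns are available, as in the examples in this post. Other engines, LuaTeX in particular, may not report the break points in the same way. The result is written to the terminal and log file, not to the page.

```tex
% Sketch only: assumes \uselanguage and the ukenglish patterns are available.
% \showhyphens reports the permitted break points in the log, not on the page.
\showhyphens{typography}   % checked with the patterns currently in force
\uselanguage{ukenglish}
\showhyphens{typography}   % checked again with the British patterns
\bye
```

Comparing the two log entries shows whether a break you dislike comes from the patterns in use or needs a \hyphenation exception.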
Modern TeX engines can use hyphenation patterns for many languages. The file hyphenation.pdf (→) lists all the hyphenation patterns available for TeX. They have been collected in a single package, hyph-utf8, which can be used with pdfTeX, XeTeX, or LuaTeX.
An example of their use is given in the following files: hyph.tex, hyph.pdf
These files use the commands \uselanguage{ukenglish}, \uselanguage{ngerman}, etc. to switch between hyphenation patterns. If you find those commands a bit long, you can create an alias like this:
\def\de{\uselanguage{ngerman}}
Now \de will apply German hyphenation.
You can maintain hyphenation exceptions for more than one language if you select the patterns for each language before its \hyphenation{} list:
\uselanguage{ukenglish}
\hyphenation{man-u-script man-u-scripts ap-pen-dix also into upon}
\uselanguage{ancientgreek}
\hyphenation{δε-δογ-μέ-νον Λα-κε-δαι-μονί-ων δύο ἀπὸ}
Entering a word in \hyphenation{} without any hyphens means it will not be hyphenated at all.
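TeX stores each exception under the language that is current when \hyphenation is read, which is why the order above matters. A small sketch of how this could be checked, assuming pdfTeX and a format in which both ukenglish and ngerman are available through \uselanguage:

```tex
% Sketch only: assumes the ukenglish and ngerman patterns are both loaded.
\uselanguage{ukenglish}
\hyphenation{man-u-script}   % stored for ukenglish only
\showhyphens{manuscript}     % the exception applies here
\uselanguage{ngerman}
\showhyphens{manuscript}     % no exception stored; the German patterns decide
\bye
```

The two reports in the log can differ, since an exception entered under one language is not consulted under another.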
There is another method which allows more control over the number of letters before and after a word break:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% for hyphenation profiles and European languages
{\catcode`@=11
\gdef\eplainsetlanguage#1#2#3{%
% do not set the language if the name is undefined in the current TeX.
\expandafter\ifx\csname lang@#1\endcsname \relax
\message{no patterns for #1}%
\else
\global\language = \csname lang@#1\endcsname
\fi
% but there is no harm in adjusting the hyphenmin values regardless.
\global\lefthyphenmin = #2\relax
\global\righthyphenmin = #3\relax
}}%
\def\ukenglish{\eplainsetlanguage{ukenglish}{2}{3}}
\def\ngerman{\eplainsetlanguage{ngerman}{2}{3}}
\def\russian{\eplainsetlanguage{russian}{2}{2}}
\def\latin{\eplainsetlanguage{latin}{2}{2}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
These macros, written by Karl Berry, allow you to change the values of \lefthyphenmin and \righthyphenmin. \lefthyphenmin sets the minimum number of characters before a hyphenation point, \righthyphenmin the minimum number of characters after it. \lefthyphenmin is normally set to 2 and \righthyphenmin to 3, so the word ‘advocate’ could be broken ad-vocate or advoc-ate, but advoca-te would not be permissible. The 2, 3 settings seem to be used in English, French and German, but other languages may have different settings.
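The effect of the two parameters can also be explored by setting them directly. This is only a sketch, assuming pdfTeX and a plain format; it uses \showhyphens to write the permitted break points to the log, and the word is the one from the paragraph above.

```tex
% Sketch only: set the limits directly instead of through \eplainsetlanguage.
\lefthyphenmin=2 \righthyphenmin=3   % the usual plain TeX values
\showhyphens{advocate}
\righthyphenmin=2   % permit breaks that leave only two letters after the hyphen
\showhyphens{advocate}
\bye
```

Lowering \righthyphenmin only removes a restriction. A break still appears only where the patterns or exceptions in force allow one.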
Also, in \def\ukenglish{\eplainsetlanguage{ukenglish}{2}{3}} the name \ukenglish can be changed to something different if you prefer, e.g.
\def\english{\eplainsetlanguage{ukenglish}{2}{3}}
\def\deutsch{\eplainsetlanguage{ngerman}{2}{3}}
\def\rus{\eplainsetlanguage{russian}{2}{2}}
\def\francais{\eplainsetlanguage{french}{2}{3}}
\def\gaeilge{\eplainsetlanguage{irish}{2}{2}}
Here is the example file above, but using the Eplain hyphenation macros. I’ve also tested the hyphenation a bit more by setting the type in two columns using Eplain’s \doublecolumns macro: hyph2.tex, hyph2.pdf
Of the two methods, the first is easier to use. It also works better when used within macros.