hyphenation - How do I get LaTeX to hyphenate a word that contains a dash?

Keywords: latex


Question: 

In a LaTeX document I'm writing, I get an overfull hbox warning because of the word "multi-disciplinary", which happens to be rendered at the end of a line.

I can get rid of this particular warning by changing it into multi-discipli\-nary, but the same problem will happen elsewhere, since this word is used a lot in the paper.

I'd like to use the \hyphenation{} command instead, but obviously my tentative \hyphenation{multi-disci-pli-na-ry} does not work, because it does not understand the first dash correctly.

What incantation do I need to get correct hyphenation in a word that already contains a dash?

Bonus question: Where could I have found the answer to that question myself?




8 Answers: 

From http://www.tex.ac.uk/cgi-bin/texfaq2html?label=nohyph:

TeX won’t hyphenate a word that’s already been hyphenated. For example, the (caricature) English surname Smyth-Postlethwaite wouldn’t hyphenate, which could be troublesome. This is correct English typesetting style (it may not be correct for other languages), but if needs must, you can replace the hyphen in the name with a \hyph command, defined

\def\hyph{-\penalty0\hskip0pt\relax}

This is not the sort of thing this FAQ would ordinarily recommend… The hyphenat package defines a bundle of such commands (for introducing hyphenation points at various punctuation characters).
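As a rough sketch of how that definition might be used in a complete document: the narrow \parbox and the sample sentence are only assumptions to provoke line breaks, and \newcommand stands in for \def so that LaTeX reports an error if the name is already taken.

```latex
\documentclass{article}
% Same body as the FAQ's \def; \hyph is the FAQ's name, not a kernel command.
\newcommand{\hyph}{-\penalty0\hskip0pt\relax}
\begin{document}
% Narrow box to force breaks; TeX may now break at the hyphen
% and also hyphenate the parts on either side of it.
\parbox{2.5cm}{A multi\hyph disciplinary team did multi\hyph disciplinary work.}
\end{document}
```

The space after \hyph is swallowed by TeX when it reads the control word, so no gap appears in the output.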


Or you could \newcommand a command that expands to multi-discipli\-nary (use Search + Replace All to replace existing words).
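A minimal sketch of that idea, assuming the hypothetical macro name \multidisc (pick whatever name suits your document):

```latex
\documentclass{article}
% \multidisc is a made-up name for this sketch.
% \- marks the extra break point, exactly as in the question's workaround.
\newcommand{\multidisc}{multi-discipli\-nary}
\begin{document}
% The empty braces keep the space after the macro from being swallowed.
Our \multidisc{} group publishes \multidisc{} research.
\end{document}
```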

 

The problem (as KennyTM noted) is that LaTeX won't hyphenate words with dashes in them. Luckily, there's a standard package (part of ncctools) that addresses that very problem, called extdash. This defines new hyphen and dash commands that do not disrupt hyphenation, and which can allow or prevent line breaks at the hyphen/dash. I prefer to use it with the shortcuts option, so I can use, e.g., \-/ rather than \Hyphdash. Here's what you want:

\usepackage[shortcuts]{extdash}
...
multi\-/disciplinary

To prevent breaking at that hyphen, use multi\=/disciplinary
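Put together as a complete file, this might look like the following sketch. The \parbox width and the sample text are assumptions, used only to make the breaks visible.

```latex
\documentclass{article}
\usepackage[shortcuts]{extdash}
\begin{document}
% \-/ : hyphen that allows a break and keeps both parts hyphenatable
\parbox{2.5cm}{A truly multi\-/disciplinary team of researchers.}

% \=/ : hyphen at which no line break is allowed
\parbox{2.5cm}{A truly multi\=/disciplinary team of researchers.}
\end{document}
```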

(Aside: The Chicago Manual of Style advises dropping the hyphens attaching affixes like 'multi', unless the word is ambiguous or unintelligible without it.)

 

I use the hyphenat package and write compound words such as the Finnish word Internet-yhteys (English: Internet connection) as Internet\hyp yhteys. It looks goofy, but it seems to be the most elegant way I've found.
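For the word from the question, a minimal sketch could look like this (the \parbox is just an assumption to force a narrow line):

```latex
\documentclass{article}
\usepackage{hyphenat}
\begin{document}
% \hyp typesets a hyphen that still lets the rest of the word hyphenate.
\parbox{2.5cm}{A multi\hyp disciplinary study of the Internet\hyp yhteys.}
\end{document}
```

If the space after \hyp bothers you, multi\hyp{}disciplinary gives the same output.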

 

I had the same problem. I use hyphenat plus the following macro:

\RequirePackage{hyphenat}
\RequirePackage{expl3}


% The following defs make sure words that contain an explicit `-` (hyphen) are still hyphenated the normal way, and double- and triple hyphens keep working the way they should. Just don't use a `-` as the last token of your document. Also note that `-` is now a macro that is not fully expandable

\ExplSyntaxOn

% latex2e doesn't like commands starting with 'end', apparently expl3 doesn't have any problems with it
\cs_new:Npn \hyphenfix_emdash:c {---}
\cs_new:Npn \hyphenfix_endash:c {--}

\cs_new:Npn \hyphenfix_discardnext:NN #1#2{#1}


\catcode`\-=\active

\cs_new_protected:Npn -{
    \futurelet\hyphenfix_nexttok\hyphenfix_i:w
}

\cs_new:Npn \hyphenfix_i:w {
    \cs_if_eq:NNTF{\hyphenfix_nexttok}{-}{
        %discard the next `-` token
        \hyphenfix_discardnext:NN{\futurelet\hyphenfix_nexttok\hyphenfix_ii:w}
    }{
        % from package hyphenat
        \hyp
    }
}

\cs_new:Npn \hyphenfix_ii:w {
    \cs_if_eq:NNTF{\hyphenfix_nexttok}{-}{
        \hyphenfix_discardnext:NN{\hyphenfix_emdash:c}
    }{
        \hyphenfix_endash:c
    }
}


\ExplSyntaxOff

Note that this uses the expl3 package from latex3.

It makes the - an active character that scans forward to see if it is followed by more dashes. If so, it stays a -, to make sure -- and --- keep working. If not, it becomes the \hyp command from hyphenat, enabling word breaks in the rest of the word. This is a generic solution that makes all words that contain explicit hyphens hyphenate normally.

Note that - becomes a macro that is not fully expandable, so try to include this after loading any other packages that may not expect - to be a macro.

Edit: This is my second version. The first version was less robust when a { or } followed a hyphen. This one handles that case, but unlike the first version, the - here is not fully expandable.

 
multi\hskip0pt-\hskip0pt disciplinary

You can, for example, define a shorthand such as

\def\:{\hskip0pt}

and then write

multi\:-\:disciplinary
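One thing to watch: LaTeX already defines \: as a medium space in math mode, so redefining it changes that command. A sketch that sidesteps this by using a hypothetical name, \zw, with a trailing \relax as in the FAQ's \hyph definition so that a following word like "plus" is not read as part of the glue:

```latex
\documentclass{article}
% \zw ("zero width") is a made-up name for this sketch; it inserts
% zero-width glue without touching the math-mode \: command.
\newcommand{\zw}{\hskip0pt\relax}
\begin{document}
\parbox{2.5cm}{A truly multi\zw-\zw disciplinary team of researchers.}
\end{document}
```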

Note that the babel Russian language package has its own set of dashes that do not prohibit hyphenation, for example "~ (a double quotation mark followed by a tilde).

 

multi-disciplinary will not be hyphenated, as explained by kennytm. But multi-\-disciplinary has the same hyphenation opportunities that multidisciplinary has.

I admit that I don't know why this works. It is different from the behaviour described here (emphasis mine):

The command \- inserts a discretionary hyphen into a word. This also becomes the only point where hyphenation is allowed in this word.

 

I answered something similar here: LaTeX breaking up too many words

I said:

you should set a hyphenation penalty somewhere in your preamble:

\hyphenpenalty=750

The value of 750 suited my needs for a two-column layout on letter paper (8.5 x 11 in) with a 12 pt font. Adjust the value to suit your needs. The higher the number, the less hyphenation will occur. You may also want to have a look at the hyphenat package, which provides a bit more than just a hyphenation penalty.
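In context, the setting might sit in the preamble like this. The class options below are an assumption that mirrors the layout described above.

```latex
\documentclass[12pt,twocolumn,letterpaper]{article}
% Cost of breaking at an automatically found hyphenation point.
% LaTeX's default is 50; larger values discourage hyphenation.
\hyphenpenalty=750
\begin{document}
Some multidisciplinary text that TeX will now hyphenate more reluctantly.
\end{document}
```

Breaks at explicit hyphens, as in multi-disciplinary, are governed by the separate \exhyphenpenalty parameter.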

 

To avoid a line break in an already hyphenated word, I used the non-breaking space ~ in combination with the negative thin space \!. For example, the command

3~\!\!\!\!-~\!\!\!D

used in the text keeps the word 3-D from being broken across lines. Probably not the best solution, but it worked for me!
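A different and commonly used technique, not the one from this answer, is to wrap the word in \mbox, since TeX never breaks a line inside a box. A minimal sketch, with the \parbox only there to force a narrow line:

```latex
\documentclass{article}
\begin{document}
% \mbox keeps its contents on one line, so no break can occur at the hyphen.
\parbox{2.5cm}{We rendered the whole scene in \mbox{3-D} graphics.}
\end{document}
```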

 
