Lucie Haskins—Jumping on the embedded indexing bandwagon—or should I? (ISC conference 2015)

Embedded indexing is still evolving as the relatively new ebook industry finds its legs. Ebook indexing is so new that it’s a bit of a Wild West, with different software, standards, and processes competing for space. Clients may hear the buzzwords and turn to you for answers. Should you make the jump to embedded indexing? Lucie Haskins looked at some of the issues you should consider when deciding.

Unlike back-of-the-book (BoB) indexing, in which you receive designed files, either in hard copy or PDF, from the client and write an index in RTF or DOCX format, which the client then typesets, embedded indexing is done in the native file, whether it’s in FrameMaker, Word, InDesign, XML, or HTML. You tag the text with index terms and send the file back to the client. In Haskins’s words, “You receive their baby, you manipulate their baby, and you send it back to them. It’s a huge responsibility.”

Some limitations of native indexing modules

Creating terms

  • No index preview
  • No autocomplete of index entries
  • Tiny marker boxes
  • Poor control of special strings, such as page range, italic or bold formatting, and cross-references

Editing terms

  • No change propagation of index entries
  • No index preview
  • No viewing indexing entries in the document
  • No temporary grouping of index entries

As a result, Haskins said that you can expect to spend 50 to 100 percent more time on embedded indexing compared with BoB indexing.

Some benefits of native indexing modules

Creating terms

  • Autogenerated entries

Work process

  • Indexer can start before final pages
  • Indexing concurrent with proofreading
  • Potential reuse in future editions, other formats

Issues specific to embedded indexing

  • access control and time constraints
  • software versions
  • version control on files and downloading/uploading

You and your client will have to discuss what software (and what version of that software) to use. For example, if you and your client are using different versions of InDesign, one of you will have to convert the file to IDML. If you don’t have the client’s fonts, your system will substitute a font that will affect flow and pagination, which means that the final index would have to be regenerated by the client. At that point, the client would have to be responsible for formatting text to italic, because InDesign doesn’t allow italicized text in index entries. Each entry has to be formatted manually, and the formatting disappears whenever the index is regenerated.

Should you bother with embedded indexing? Haskins says you shouldn’t feel you have to, unless existing or prospective clients have approached you directly about it and you have an interest in it. Haskins doesn’t recommend jumping on the bandwagon otherwise, because the field may evolve into something else entirely in a few years. For example, there are hints that BoB indexing using anchors at the paragraph level may be where the field ends up. It would use techniques familiar and intuitive to indexers and would obviate the need for specialized software. Buying all of the software and upgrading your equipment would be a significant investment of money; educating yourself and your client on the software and the process would be an investment of time.

If you do want to learn embedded indexing, however, Haskins suggests

Indexing in Adobe InDesign Creative Cloud—Judy Dunlop (ISC conference 2014)

Thanks to the advocacy efforts of the American Society for Indexing’s Digital Trends Task Force and the International Digital Publishing Forum’s Indexes Working Group, Adobe has heard the pleas of indexers to allow embedded indexing in InDesign to output linked indexes. In the Creative Cloud version, launched June 2013, InDesign can generate a linked index in multiple digital formats, including PDF, EPUB, and HTML.

The Indexing Society of Canada’s Judy Dunlop has done one project using the new Creative Cloud workflow, and she gave us an overview of what she learned through that experience. Dunlop had almost exclusively done back-of-the-book indexing for scholarly clients but decided to venture into digital indexing a couple of years ago. She took InDesign indexing workshops offered by Jan Wright and Lucie Haskins, and trained herself on InDesign through tutorials on

For indexers and publishers to work together on an embedded index in InDesign, said Dunlop, they need to use the same version. The Creative Cloud version is the only one that produces linked indexes from the embedded tags, and it is available by subscription only, so an indexer who doesn’t ordinarily have to use InDesign can easily subscribe to the program for a month, then cancel the subscription once the project is over.

In a typical workflow, the publisher would supply the indexer with the live InDesign files that have been edited and proofread. The indexer may embed tags directly in the InDesign file or create an index first in dedicated indexing software (such as CINDEX, SKY Index, or Macrex) then convert the locators into markers using a script available through Kerntiff Publishing Systems. InDesign’s index entries don’t include italics, bold, or decorations such as n for “note,” so the designer has to apply those styles manually. Every time the indexer revises the live file, the publisher has to regenerate the index and reapply special styles. Good communication—directly between designer and indexer—is key, said Dunlop. Designers who have traditionally been given a static index to typeset won’t be used to the process of regenerating and reformatting the index.

Many publishers will not have tried this workflow. Some haven’t yet moved to Creative Cloud because of subscription costs. Further, many of them will be reluctant to relinquish control of their live files. As the indexer, if you are allowed to work with the live files, you have to be particularly careful not to make any inadvertent changes to them. (It’s theoretically possible to tag in Word and import into InDesign, but, Dunlop said, that feature is buggy and is generally not recommended.)

Allegedly, said Dunlop, you don’t need the publisher’s fonts to do the index, but on her project she found that the font mismatch caused problems. If the publisher offers you fonts, take them.

So far, Dunlop has found that force sort, indented vs. run-in style, and multiple levels of headings are features that work well in InDesign. However, the program doesn’t seem to handle cross-references well: not only are they not linked, but multiple cross-references are not rendered in the usual style (e.g., “See also Vancouver, BC. See also Kelowna, BC” rather than the preferred “See also Kelowna, BC; Vancouver, BC”), and generic cross-references have to be manually italicized. As mentioned, the designer also needs to apply special formatting, such as italics, bold, and decorated page numbers.
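The cross-reference cleanup Dunlop describes is mechanical enough to sketch in code. The following is only an illustration in Python, not part of her workflow or of InDesign: `merge_see_also` is a hypothetical helper that assumes each reference arrives as its own “See also” sentence in plain text, and it ignores italics entirely.

```python
import re

def merge_see_also(text):
    """Combine repeated 'See also X' sentences into one alphabetized reference."""
    # Split on the lead-in, then drop empty pieces and trailing periods.
    targets = [piece.strip().rstrip(".").strip()
               for piece in re.split(r"See also", text)
               if piece.strip()]
    if not targets:
        return ""
    # Sort case-insensitively and join with semicolons, the usual index style.
    return "See also " + "; ".join(sorted(targets, key=str.casefold))

print(merge_see_also("See also Vancouver, BC. See also Kelowna, BC"))
# prints: See also Kelowna, BC; Vancouver, BC
```

In practice the designer would still need to restore the italics on “See also” after a pass like this.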

Because cross-references aren’t linked, Dunlop suggests double-posting instead. She also advocates being as succinct as possible, because long entries create unsightly breaks in EPUBs.

The linked-index functionality in InDesign Creative Cloud is so new that “everyone is learning,” said Dunlop. Publishers, editors, designers, and indexers will need to work together to figure out a system that works well for them. “Experiment—you’re not going to know what you’re going to get until you try it—then learn from your mistakes,” said Dunlop. Once you’ve got one project under your belt, you’re already in a better position than most and can share what you’ve learned with others.

If you have a client who is reluctant to try the workflow, Dunlop suggested that you offer to create an embedded index for a backlist title that still sells well and is available as an ebook. The risk to the publisher is lower than for a frontlist title with a tight deadline, and you can help them become familiar with the new indexing process.

Whither the ebook index?—Erin Mallory (ISC conference 2014)

Erin Mallory is the manager of cross-media at House of Anansi Press, which has been publishing ebooks (in addition to its print books) since 2009. Mallory launched the Indexing Society of Canada’s 2014 conference with an overview of the current state of ebook indexing workflows.

Ebook formats

Ebooks come in three main formats:

  • PDFs support some multimedia and interactivity and are easy to create but have limited sales channels. The static format of PDFs makes them popular for technical or reference books but may create a poor reading experience for readers using certain devices (for example, trying to read on a smartphone).
  • EPUB is the most popular ebook format and is essentially a self-contained website, using XML and CSS. Text is reflowable. EPUB is a neutral, standard format compatible with all current e-readers except the Kindle. EPUB 2, still the most commonly used version, is based on HTML 4 and CSS 2. EPUB 3 is a newer format, with many improvements in functionality, accommodating languages that read vertically or from right to left, as well as MathML.
  • MOBI is also based on XML and CSS but is proprietary to Amazon and is compatible only with Kindle devices and apps.

The main reading engines are:

  • Adobe Reader Mobile SDK, which renders ebooks on Adobe Digital Editions, Kobo, and Nook.
  • WebKit, which renders ebooks on most mobile e-readers, including the iPad, and browser-based e-readers.

Publisher’s considerations

Ebook indexes are really only useful if they are fully hyperlinked. Until recently, hand coding each hyperlink was the only way to create a fully functional ebook index, so publishers had to consider the return on investment. Not only is creating an ebook index time consuming, but proofing the index adds time to the quality-assurance process.

Further, the publisher has to consider what devices its audience is using. First-generation Kindles and Kobos don’t support hyperlinking, and not all e-readers support a “back to” function.

Because of these limitations, Anansi decided when it launched its ebook program in 2009 not to include indexes in ebooks at all. Today, the publisher has adopted a workflow that has streamlined some aspects of ebook index creation.

Recent improvements

Scripts for Adobe Creative Suite 5+ can be very useful; some auto-generate cross-references in a formatted index that are maintained when exported to EPUB. The scripts aren’t perfect, so some (about half) of the links still need to be hand coded. These scripts use styles, so if a designer hasn’t properly styled the index, they won’t work properly.

There are also scripts that convert an external index (for example, one created in Word or a program like CINDEX) to create an index in InDesign that is maintained on PDF export.

The Creative Cloud version of InDesign allows for linked indexes to be exported into EPUB. Publishers can be reluctant to relinquish control of their InDesign files to an indexer, but Mallory acknowledges that if professional indexers can save time by embedding the index, publishers may have to push aside their reluctance and find ways of working with them.

Project considerations

For each project, ask yourself the following:

  • Does your ebook need an index?
  • Does the index have to match the print book?
  • What devices will your readers use?
  • Can the index be adapted to better serve the digital reading experience?
  • Can you change your indexing workflow to simplify the ebook index creation process?
  • What kind of markers do you want to use?

Mallory points out that in an ebook, using page numbers may not make the most sense. Some indexers in the audience remarked that seeing a page range communicates important information about subject coverage. InDesign indexes can allow the range to be listed but link only the first page number.
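To make the “list the range, link the first page” idea concrete, here is a small Python sketch. It does not reproduce InDesign’s actual export; the function name `linked_locator` and the `#page12`-style anchor pattern are assumptions made up for the illustration.

```python
from html import escape

def linked_locator(first_page, last_page=None, href_template="#page{page}"):
    """Return an HTML locator that shows a full range but links only the first page."""
    # href_template is a hypothetical anchor scheme; a real ebook would use
    # whatever IDs its own files contain.
    href = escape(href_template.format(page=first_page), quote=True)
    link = '<a href="{}">{}</a>'.format(href, first_page)
    if last_page is None or last_page == first_page:
        return link
    # The range end is plain text joined with an en dash, so readers still
    # see how much coverage the topic gets.
    return "{}\u2013{}".format(link, last_page)

print(linked_locator(12, 17))
```

The reader taps “12” to jump to the start of the discussion, while the unlinked “17” still signals how long the discussion runs.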

(On Day 2 of the conference, Judy Dunlop gave an excellent summary of the workflow she used in a recent project doing embedded indexing in InDesign Creative Cloud. Post coming soon!)


Christine Middlemass—Libraries in an evolving landscape (EAC-BC meeting)

Christine Middlemass, now the Vancouver Public Library’s manager of collection and technical services, has been at the VPL (recently named best library system in the world by researchers at Heinrich Heine University in Dusseldorf) for thirty-five years. Over that time she’s seen the library undergo massive change, and she joined us at the March EAC-BC meeting to give us a glimpse into that evolution, noting that developments are happening so quickly now that “what I tell you today will probably be different in a week.”

In the beginning, the VPL aimed to build a balanced collection at each branch. At that time, the time of the card catalogue, it wasn’t easy to know what was available at another location, so each branch was effectively independent. Print was king, with hardcover being the main format, and the library operated on a “just-in-case” basis, meaning the librarians had to anticipate what users would want. Back then, the library would also provide print-based reference services: “I, as the reference librarian, was the search engine,” said Middlemass. “It’s amazing to think about it now, but the information really was all in our heads.” The VPL’s focus was on creating a product—this perfectly balanced collection—and the strategy “worked great for at least twenty of my thirty-five years.”

What changed? The short answer is technology: thanks to Google and the Internet, librarians don’t get the same number of questions. At the same time, they’re deluged in other ways, today having to consider the entire VPL system rather than focusing on an individual branch. They also have to review a mountain of information, including print catalogues, e-catalogues, databases, self-published authors, and many other sources, when adding to the collection. What’s more, the VPL is expected to buy the same content over and over again, in different formats: print, ebooks, audio, e-audio, DVD, Blu-ray—and all with more and more budgetary pressures. The library no longer owns much of its collection; licences for ebooks and other electronic media are all different, and each has to be negotiated separately. For example, HarperCollins limits each ebook to twenty-six circulations, and Penguin offers licences limited to one year. “They’re making up their own minds about what they’re going to charge us. And they’re not always sharing the logic behind it.”

According to a January report from the Pew Research Center, 28 per cent of adults read an ebook in 2013 (up 5 per cent from the previous year). Of those readers, 47 per cent were under 30, and 17 per cent were over 65. “Part of my career was spent lobbying for quality books in large print,” explained Middlemass. But now people can simply bump up the font size on a tablet. At the VPL, ebooks make up 2 to 4 per cent of lending. Borrowing ebooks can be challenging: if the library has only one licence for a book and someone else has borrowed it, you have to put a hold on it and sit on a waiting list for it to be available. Once you get it, you have a limited amount of time to read it before it evaporates off your device. “That doesn’t make a lot of sense to most of us. With a physical book, sure, but with a digital resource?” It doesn’t help that some devices use proprietary file formats and many vendors insist on bundling their content, offering books you want only in packages including a bunch of books you don’t want. Bundling is something librarians and advocacy groups like ReadersFirst are actively fighting. “I don’t want to be using taxpayer money to buy books that people won’t use,” said Middlemass.

These days a selections team of seven librarians oversees acquisitions for the entire system, although each branch still has its own profile that the team keeps in mind. A portion of the library’s collection is “floating”; some items don’t have a permanent home. The VPL also collaborates with other libraries in the Lower Mainland via InterLINK to share collections, and patrons can access other libraries’ collections with an interlibrary loan.

The VPL has shifted from offering a product to a service: librarians now aim to get you the material you’re looking for, when you need it—and the material that people request reflects the library’s changing community. In the early days, the VPL carried mostly English books, with some French; it now offers fourteen additional languages and is figuring out how to add more. The library has strived to balance public demand with acquisitions made based on positive critical reviews and embraces patron-driven acquisition, where, as Middlemass explained, “‘suggest a purchase’ meets interlibrary loan.” Knowing that one of its strengths is its collection of local books, the VPL is strengthening relationships with local publishers, including self-publishers. If the library finds out about a local self-published book, it will usually acquire a copy. “Some authors can be naive and end up spamming everyone at the library,” laughed Middlemass. “But some self-published books are very, very good.”

The VPL is experimenting with different ways of promoting reading, advocating for readers, and bringing readers and writers together, from holding workshops on writing and on self-publishing to hosting writer-in-residence programs and book clubs. It is also promoting its physical spaces, offering quiet places for users to work and read, as well as venues for groups to meet. By the end of this year, the VPL hopes to open its Inspiration Lab, a digital content space that will support users as they generate their own content.

What the heck’s happening in book publishing? (EAC-BC meeting)

Freelance writer, editor, indexer, and teacher Lana Okerlund moderated a lively panel discussion at the November EAC-BC meeting that featured Nancy Flight, associate publisher at Greystone Books; Barbara Pulling, freelance editor; and Laraine Coates, marketing manager at UBC Press. “There are lots of pronouncements about book publishing,” Okerlund began, “with some saying, ‘Oh, it’s doomed,’ and others saying that it’s undergoing a renaissance. What’s the state of publishing now, and what’s the role of the editor?”

Flight named some of the challenges in trade publishing today: publishers have had to scramble to get resources to publish ebooks, even though sales of ebooks are flattening out and in some cases even declining. Print books are also declining: unit sales are up slightly, but because of the pressure to keep list prices low, revenues are down. Independent bookstores are gone, so there are fewer places to sell books, and Chapters-Indigo is devoting much less space to books. Review pages in the newspaper are being cut as well, leaving fewer options for places to publicize books. The environment is hugely challenging for publishers, explained Flight, and it led to the bankruptcy just over a year ago of D&M Publishers, of which Greystone was a part. “We’ve all risen from the ashes, miraculously,” she said, “but in scattered form.” Greystone joined the Heritage Group while Douglas & McIntyre was purchased by Harbour Publishing, and many of the D&M staff started their own publishing ventures based on different publishing models.

The landscape “is so fluid right now,” said Pulling. “It changes from week to week.” There are a lot of prognosticators talking about the end of the traditional model of publishing, said Pulling. The rise of self-publishing—from its accessibility to its cachet—has led to a lot of hype and empty promises, she warned. “Everybody’s a publisher, everybody’s a consultant. It raises a lot of ethical issues.”

The scholarly environment faces some different challenges, said Coates. It can be quick to accept new things but sometimes moves very slowly. Because the main market of scholarly presses has been research libraries, the ebook issue is just now emerging, and the push is coming from the authors, who want to present their research in new ways that a book can’t really accommodate. She gave as examples researchers who want to release large amounts of their data or authors of Aboriginal studies titles who want to make dozens of audio files available. “Is confining ourselves to the book our mandate?” she asked. “And who has editorial control?”

Okerlund asked the panel if, given the rise in ebooks and related media, editors are now expected to be more like TV producers. Beyond a core of editorial skills, what other skills are editors expected to have?

“I’m still pretty old-fashioned,” answered Flight. “The same old skills are still going to be important in this new landscape.” She noted an interesting statistic: ebook sales are generally down, but sales of ebooks for kids in particular fell 45 per cent in the first half of 2013. As for other ebook bells and whistles, Greystone has done precisely one enhanced ebook, and that was years ago. They didn’t find the effort of that project worth their while. Coates agreed, saying, “Can’t we just call it [the enhanced ebook] a website at this point? Because that’s what it really is.” Where editorial skills are going to be vital, she said, is in the realm of discoverability. Publishers need editors to help with metadata tagging and identifying important themes and information. Scholarly presses are now being called upon to provide abstracts not just for a book but also for each chapter, and editors have the skills to help with these kinds of tasks.

Pulling mentioned a growing interest in digital narratives, such as Kate Pullinger’s Inanimate Alice and Flight Paths, interactive online novels that have readers contribute threads to the stories. Inanimate Alice was picked up by schools as a teaching tool and is considered one of the early examples of transmedia storytelling. “Who is playing an editor’s role in the digital narrative?” asked Pulling. “Well, nobody. That role will emerge.”

Okerlund asked if authors are expected to bring more to the table. Flight replied, “Authors have to have a profile. If they don’t, they are really at a huge disadvantage. We’re not as willing to take a chance on a first-time author or someone without a profile.” Pulling expressed concern for the authors, particularly in the “Wild West” of self-publishing. “What happens to the writers?” she asked. In the traditional publishing model, if you put together a successful proposal, the publisher will edit your book. But now “Writers are paying for editing. Writers are being asked to write for free. They need to be able to market; they need to know social media. It’s very difficult for writers right now. Everybody’s trying to get something for nothing.” She also said that although self-publishing offers opportunity in some ways, “there’s so much propaganda out there about self-publishing.” Outfits like Smashwords and Amazon, she explained, have “done so much damage. It’s like throwing stuff to the wall and seeing what sticks, and they’re just making money on volume.”

Pulling sees ethical issues not only in those business practices but also in the whole idea of editing a work to be self-published, without context. “It’s very difficult to edit a book in a vacuum,” she said. “You have to find a way to create a context for each book,” which can be hard when “you have people come to you with things that aren’t really books.” She added, “Writers are getting the message that they need an editor, but some writers have gotten terrible advice from people who claim to be editors. Book editing is a specialized skill, and you have to know about certain book conventions. Whether it’s an ebook or a print book, if something is 300,000 words long, and it’s a novel, who’s going to read that?” A good, conscientious book editor can help an author see a larger context for their writing and tailor their book to that, with a strong overall narrative arc. “It’s incumbent upon you as a freelancer to educate clients about self-publishing,” said Pulling. Coates added, “We have a real PR problem now in publishing and editing. We’ve gotten behind in being out there publicly and talking about what we do. The people pushing self-publishing are way ahead of us. I think it’s sad that writers can’t just be writers. I can’t imagine how writing must suffer because of that.”

Both Flight and Pulling noted that a chief complaint of published authors was that their publishers didn’t do enough marketing. But, as Pulling explained, “unless it’s somebody who is set up to promote themselves all the time, it’s not as easy as it looks.” Coates said that when it comes to marketing, UBC Press tries everything. “Our audiences are all over the place,” she explained. “We have readers and authors who aren’t on email to people who DM on Twitter. It’s subject specific: some have huge online communities.” Books built around associations and societies are great, she explained, because they can get excerpts and other promotional content to their existing audiences. She’s also found Twitter to be a great tool: “It’s so immediate. Otherwise it’s hard to make that immediate connection with readers.”

Okerlund asked the panel about some of the new publishing models that have cropped up, from LifeTree Media to Figure 1 Publishing and Page Two Strategies. Figure 1 (started by D&M alums Chris Labonté, Peter Cocking, and Richard Nadeau), Pulling explained, does custom publishing—mostly business books, art books, cookbooks, and books commissioned by the client. Page Two, said Pulling, is “doing everything.” Former D&Mers Trena White and Jesse Finkelstein bring their clients a depth of experience in publishing. They have a partnership with a literary agency but also consult with authors about self-publishing. They will also help companies get set up with their own publishing programs. Another company with an interesting model is OR Books, which offers its socially and politically progressive titles directly through its website, either as ebooks or print-on-demand books.

The scholarly model, said Coates, has had to respond to calls from scholars and readers to make books available for free as open-access titles. The push does have its merits, she explained: “Our authors and we are funded by SSHRC [the Social Sciences and Humanities Research Council of Canada]. So it makes sense for people to say, ‘If we’re giving all this money to researchers and publishers, why are they selling the books?’” The answer, she said, lies simply in the fact that the people issuing the call for open access don’t realize how many resources go into producing a book.

So where do we go from here? According to Pulling, “Small publishers will be okay, as long as the funding holds.” Flight elaborated: “There used to be a lot of mid-sized publishers in Canada, but one after another has been swallowed up or gone out of business.” About Greystone since its rebirth, Flight explained, “We’re smaller now. We’re just doing everything we’ve always done, but more so. We put a lot more energy into identifying our market.” She added, “It’s a good time to be a small publisher, if you know your niche. There’s not a lot of overhead, and there’s collegiality. At Greystone we’ve been very happy in our smaller configuration, and things are going very well.”

Pulling encouraged us to be more vocal and active politically. “One of the things we should do in Vancouver is write to the government and get them to do something about the rent in this city. We don’t have independent bookstores, beyond the specialty stores like Banyen or Kidsbooks. And at the same time Gregor Robertson is celebrating Amazon’s new warehouse here?” She also urged us to make it clear to our elected representatives how much we value arts funding. One opportunity to make our voices heard is coming up at the Canada Council’s National Forum on the Literary Arts, happening in February 2014.

Use hyphens wisely: Discretion is advised

Having just educated two of my designer friends—both award-winning veterans of the book industry—about the discretionary/optional hyphen, I realized that maybe not everyone knows about it after all. Convincing designers to embrace the discretionary hyphen can mean saving a lot of proofing time (or, at the very least, eliminating a proofing worry), so I’ve found myself proselytizing, and I might as well do that here, too.

What they are

You’re familiar with the good ol’-fashioned regular hyphen (like the one in “ol’-fashioned”), also known as the hard hyphen. If a line breaks after a hard hyphen, it’s no big deal. In contrast, you wouldn’t want a line break after the hyphen in a phone number, say, or a numeral-unit adjective (e.g., 4-ton jack), and in those situations you’d want to use a nonbreaking hyphen.

But let’s say you’re reading a proof where a word has broken where you don’t want it to break—e.g., mi•crowave instead of micro•wave. What happens when you mark up the proof asking the designer to rebreak the word?

Well, the way many designers have been told to solve the problem is simply to add a (hard) hyphen where they want the break to happen. The approach seems to resolve the issue, but it’s not an elegant fix. What they should be using is a discretionary hyphen (Ctrl/Command + Shift + - in InDesign), which appears if the word breaks at the end of the line but remains invisible when it doesn’t.

Let’s say the designer has added a hard hyphen to “microwave” to make it break as “micro-” at the end of one line and “wave” at the start of the next.

If you made text changes that pushed “micro” to the following line, for example, you’d end up with “micro-wave” on one line, and the proofreader would have to ask for that hyphen to be deleted.

Using a discretionary hyphen would mean that “microwave” would continue to break as “micro-” and “wave” if it flowed over two lines but appear as “microwave” otherwise.
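For anyone who likes to see the behaviour spelled out, here is a toy Python model of the difference. It is a sketch only: it assumes the discretionary hyphen is stored as the Unicode soft hyphen (U+00AD), and `set_word` is a made-up function that looks at a single word and the space left on the line, nothing like a real composition engine.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN, assumed to stand in for a discretionary hyphen

def set_word(word, room):
    """Place `word` on a line with `room` characters left.

    Returns (text set on this line, text carried to the next line).
    """
    shown = word.replace(SOFT_HYPHEN, "")
    if len(shown) <= room:
        # No break needed: soft hyphens vanish, hard hyphens stay visible.
        return shown, ""
    for mark in (SOFT_HYPHEN, "-"):
        if mark in word:
            head, tail = word.split(mark, 1)
            if len(head) + 1 <= room:
                return head + "-", tail.replace(SOFT_HYPHEN, "")
    # Nothing fits before a break point, so the whole word moves down.
    return "", shown

print(set_word("micro\u00adwave", 6))   # breaks at the discretionary hyphen
print(set_word("micro\u00adwave", 20))  # fits on the line: no hyphen shown
print(set_word("micro-wave", 20))       # fits on the line: stray hard hyphen remains
```

The last call is the proofreader’s headache: once the text reflows, the hard hyphen is still sitting in the middle of the word.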

(Apparently, if you add a discretionary hyphen before a word, InDesign prevents that word from being broken at all—handy for some proper nouns. More information about hyphens in InDesign can be found here.)

Why they help

Beyond the fact that the proofreader no longer has to worry about designer-introduced hard hyphens, discretionary hyphens are especially helpful for texts that are destined for more than one format or medium. Many publishers create their ebooks from their InDesign files, and because EPUB text can reflow, hard hyphens introduced to break a word in a desirable place for the print edition are bound to show up where they aren’t needed in the ebook. Either a proofreader has to go through the ebook text and remove them, or the publisher leaves them in and effectively sacrifices some of its editorial standards in its ebooks. Similarly, reprints (e.g., when a hardcover is reformatted as a mass-market paperback) would be a lot less work for the proofreader if designer-introduced hard hyphens were no longer a concern.

What they could mean to editors

We could nip the problem in the bud a bit earlier in the production process if copy editors also used discretionary hyphens (called optional hyphens in Microsoft Word—shortcut key: Ctrl/Command + -) after common prefixes in closed compounds. (As if copy editors needed any more responsibility!) It’s probably impossible to anticipate every possible bad word break, but a few global searches would be fairly easy to do at the copy-editing stage and would eliminate a lot of the distraction for the proofreader.
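A global search of this kind could look something like the Python sketch below. It is an assumption-laden illustration: it works on plain text rather than a Word file, it represents the optional hyphen as the Unicode soft hyphen (U+00AD), and the prefix list is invented for the example.

```python
import re

SOFT_HYPHEN = "\u00ad"  # assumed plain-text stand-in for Word's optional hyphen

# Illustrative prefix list only; a real list would come from the house style guide.
PREFIXES = ("micro", "inter", "multi", "over", "under")

# Match a prefix at the start of a word when at least three letters follow it.
_PREFIX_RE = re.compile(
    r"\b(" + "|".join(PREFIXES) + r")(?=[a-z]{3,})", re.IGNORECASE
)

def add_optional_hyphens(text):
    """Insert a soft hyphen after each listed prefix in closed compounds."""
    return _PREFIX_RE.sub(lambda m: m.group(1) + SOFT_HYPHEN, text)

marked = add_optional_hyphens("Reheat it in the microwave.")
print(marked.replace(SOFT_HYPHEN, "|"))  # show where the break point landed
```

A pattern this blunt will also catch words that only look like compounds (“interest,” for instance), so the results would still need a human eye before they go to the designer.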

What to keep in mind

Ideally, all optional hyphens in Word would translate seamlessly into discretionary hyphens in InDesign. Apparently the two programs don’t always play nicely together, though, so if you’re a copy editor prepping a file for design, it might be worth sending a few test files to the designer you’re working with, to figure out if the special characters, including nonbreaking spaces, nonbreaking hyphens, and discretionary hyphens, among others, will come through.

Also, discretionary hyphens may cause problems for online text because different standards treat them differently, some translating discretionary hyphens into hard hyphens. Again, you may want to test some files, particularly in an ebook workflow, to see if inputting discretionary hyphens is worth the copy editor’s time or if they should be inserted by the designer and only as needed for the print publication. Luckily, designers can just as easily search an InDesign file for discretionary hyphens they’ve inserted and remove them for the ebook version.
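If the text has already been exported, the same cleanup can be done outside InDesign. This Python sketch assumes the discretionary hyphens have come through as the soft hyphen character or one of its HTML entity spellings; `strip_soft_hyphens` is a hypothetical helper, and whether your export actually produces these forms is exactly what the test files should tell you.

```python
import re

# The soft hyphen as a raw character, a named entity, or a numeric entity.
_SOFT_HYPHEN_FORMS = re.compile("\u00ad|&shy;|&#0*173;|&#x0*ad;", re.IGNORECASE)

def strip_soft_hyphens(markup):
    """Remove every soft hyphen from exported text or markup."""
    return _SOFT_HYPHEN_FORMS.sub("", markup)

print(strip_soft_hyphens("micro\u00adwave and micro&shy;scope"))
# prints: microwave and microscope
```

Hard hyphens can’t be cleaned up this way, because a script has no means of telling a designer-introduced hyphen from a legitimate one.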

How you can make the world a more discretionary place

Next time you’re proofreading and you notice one of those manually added hyphens that buggers up a word, just mention discretionary hyphens to the designer. The designers I spoke to were happy to learn about them and were excited about the prospect of saving proofreading time and, more importantly, not inadvertently introducing errors.

Pilar Wyman—Metadata, marketing, and more (ISC conference 2013)

Pilar Wyman (@pilarw on Twitter) is the immediate past president of the American Society for Indexing, as well as a member of the ASI’s Digital Trends Task Force, and she spoke at the ISC conference about promoting indexes as metadata and showing our clients how our indexes can be used as sales tools for their books.

We’re used to thinking of a book’s metadata as information about the book as a product—its title, author, ISBN, etc.—but a book’s index can also serve as metadata: each index heading and subheading can be thought of as a tag for a chunk of text that we want readers to see. As a result, readers can use this metadata to provide them with a filtered view of the content that reveals specific facets or dimensions of a book.

Indexes, Wyman argues, are as important for ebooks as a search function. They

  • add browsability and help readers find what they need by expanding the number of access points to content
  • serve as a navigational tool
  • offer pre-analysis: indexes give readers a good sense of the range of topics covered and the importance of each
  • provide a conversation with the reader, allowing publishers to show what their product has to offer

Wyman advocates giving away a book’s index for free (as Amazon essentially does with its Look Inside feature) as a marketing strategy, to let readers know what they could be getting. She also showed us the potential of index mashups, in which you combine the indexes of several publications in a collection, allowing users to browse or search across all of them. These mashups could be enormously useful for “scrapbook files”—collections of content from a variety of sources, as you’d find in a university course pack, for example. Each heading in the mashed-up index is a link, taking you either directly to the content or to a summary screen of available information, with context. Most importantly for publishers, these indexes would offer users a direct link to purchase any of the books included in the mashup.

To exploit this marketing potential of ebook indexes, whether they are standalone indexes or mashups, publishers should link them—both in to the content and out to further resources or places to buy the book. These linked indexes should be included as back-of-the-file chapters or, better yet, in the front of an ebook so that the index gets searched first. For usability, the index should be accessible wherever you are in the book (just as you can flip to the back of a print book anytime you want), and the “find” tool should bring up the best hits, as identified by the index. Results should show snippets of a term in context, and cross-references should help the reader refine their search terms.

Generic cross-references can often present a dilemma for the indexer (e.g., Does See specific battles really give readers the information they need?), but Wyman’s vision for the EPUB index eliminates this problem: “specific battles” would link to a list of those battles, which would in turn link to the corresponding headings in the index. She also added that smart use of tagging would allow you to filter based not only on concept but also on type of content. For example, many of us already indicate definitions with boldface, images with italics, etc. This “decoration metadata,” as Wyman calls it, can be another layer of information that users can use to narrow their search down to what they need. Wyman also introduced the concept of a reverse index: users can highlight a section of text and discover what terms in the index are associated with it, allowing them to easily jump to other places in the text that discuss the same topic.

As indexers, Wyman said, we’re already skilled at figuring out aboutness and can easily apply those skills, especially if we’re already familiar with embedded indexing, to semantic tagging of text. If we can persuade our clients of the value of using our indexes as a sales tool, we can further leverage our expertise.


(My take: I think the idea of index mashups is brilliant. My colleagues who work in academic publishing spend huge amounts of time compiling different catalogues for different subject areas and markets. Offering one index mashup of all of their Aboriginal studies titles and another for their women’s studies titles, for example, could allow them to show the breadth and depth of their list to particular target markets, including academics considering course adoptions and subject-specific libraries.)

Book review: Book Was There

As a professor of literature at McGill University, Andrew Piper is, in essence, a professional reader, and he brings this experience to his latest book, Book Was There: Reading in Electronic Times (University of Chicago Press), in which he offers a very personal meditation on our evolving relationship with reading. In what ways is a physical book more than its content? How have screens and digital technologies changed the way we understand, interact with, and share texts? Writes Piper in the book’s prologue,

This book is not a case for or against books. It is not about old media or new media (or even new new media). Instead, it is an attempt to understand the relationship between books and screens, to identify some of their fundamental differences and to chart the continuities that might run between them. (p. ix)

Their “fundamental differences” dominate the content of the book, since the similarities between books and screens are perhaps more obvious. Piper wonders whether, in the era of ebooks, we have become a culture of skimmers—the slipperiness of digital text and the immediacy of the page turn on e-reading devices mean that we don’t give ourselves the chance to digest what we read. Exacerbating this problem is a glut of content:

We have entered into an exponential relationship to the growth of reading material. Like many parents or educators, I worry that the growing expanse of reading pulls us apart, not just socially, but also personally. The incessant insistence on the functionality of reading—that there must be some “value” to it—only amplifies this problem. When there is so much more to read and when we are always reading for some purpose, we are only ever “catching up.” (p. 129)

The social aspect of reading is key to Piper’s book. Reading in itself is an act of isolation, yet by reading, we develop a common culture that socializes us. We catalyze that socialization by sharing what we read, and Piper notes that whereas sharing a printed book is a meaningful act—not only do you give something up to another reader, but by doing so you also make public your appreciation and endorsement for that book—sharing digital content may not carry the same weight:

If I do not have any collection of digital files in the same way as my books, will I be able to give them away in the same manner? When I pass down my books to my children, I imagine I will be sharing with them a sense of time. Books are meaningful because as material objects they bear time within themselves. (p. 107)

As much as Piper attempts to remain above the fray in the oft-promoted argument that digital reading and ebooks spell the death of print, his text is tinged throughout with nostalgia for print books—or, at least, for their former primacy as the authoritative sources of human knowledge. He argues that each time a new technology replicates a book’s functionality in terms of conveying content, we try to load it with features that replicate other aspects of a printed book, whether it be the ability to hold it in our hands, turn pages, or jot down marginal notes. The relationship between print and digital texts, however, doesn’t have to be antagonistic; their co-existence could have benefits to understanding:

The use of multiple channels—speech, scroll, book—is the best guarantee that a message will be received, that individuals will arrive at a sense of shared meaning. Like the book’s ability to conjoin the different faculties of touch, sight, and sound into a single medium, according to the tradition of the Codex Manesse the book itself is imagined to reside within a more diverse ecology of information. When we think about media death, about the idea of the end of certain technologies, we do well to remember this medieval insistence on the need for redundancy, the importance of communicating the same thing through different channels. (p. 7)

As much as he is a champion of the printed book (downplay that fact as he may), Piper acknowledges that some digital technologies can be used to enhance our understanding of text. I like this example:

Feature Lens, which was developed by the Human–Computer Interaction Lab at the University of Maryland, is a program that allows you to view meaningful semantic patterns within large structures of texts. For Tanya Clement, who undertook an analysis of Gertrude Stein’s famously difficult and repetitive novel The Making of Americans (1925), the interface revealed a range of structural patterns so far unnoticed by readers. (p. 141)

Although Book Was There raises some interesting points, I didn’t find any of what Piper wrote particularly revelatory, and because the text was so personal to him, there were portions to which I just couldn’t relate and that made me feel disengaged from the book. Maybe I’m not well-read enough to appreciate the author’s many literary references, but their liberal use, often in places where I would have found historical or scientific examples more persuasive, made some of his conclusions seem tenuous. He reads a lot into shared terminology between the print and digital worlds—pages and page views, the faces of Facebook and the faces of typefaces, etc.—extracting meaning where I simply see a metaphor to allow users to relate to new technologies. Although I appreciate Piper’s enthusiasm for this topic, I would stop short of recommending this book to anyone looking for objective and rigorous insight into the act and consequences of reading. Instead I’ll cross my fingers and hope that Alberto Manguel will one day decide to update A History of Reading to encompass the digital age.


Lara Smith gave a captivating and hugely informative presentation about ebooks at Wednesday’s EAC-BC meeting. Having gone to Greg Ioannou’s conference talk about e-publishing, I wondered if there’d be a lot of overlap in the content of the two talks. There wasn’t—and after the meeting BC Branch Chair Peter Moskos suggested to me that Lara probably had enough material to fill a full seminar.

Ebooks are often thought to be electronic versions of print books, Lara began, but many titles today are just born digital. Ebooks come in two main formats: PDF and EPUB. The ebook PDFs aren’t just your regular PDFs—they’re Universal PDFs, which are optimized for screen viewing. Chapters are bookmarked, the table of contents is linked, URLs are live, and the files include some metadata.

In the early days of ebooks, there were many different ebook formats; every e-reader developer wanted to create a device with a proprietary format, which led to a very fractured market. The International Digital Publishing Forum set out a standard known as EPUB—a set of rules that everyone could follow to build an ebook. All devices now have the capacity to read EPUB files. We’re not sure what the future will be for EPUB, though, because device manufacturers still like to add on proprietary bells and whistles to their EPUB files.

EPUBs can have fixed layouts or be flowable. Fixed-layout EPUBs look a bit like PDFs, but they have a lot more capability behind the scenes (e.g., accessibility features like text to speech). They’re much more complicated to create, and although they’re good for visual books, such as coffee-table books or cookbooks, they’re really meant to be read on a tablet device. Lara demonstrated how impractical it is to read a fixed-layout EPUB on a smartphone.

By contrast, flowable EPUBs can be read on a phone—not to mention e-readers and browsers—since the type can be enlarged as needed. Flowable EPUBs make up the bulk of the ebooks out there.

An EPUB, Lara explained, is really just a ZIP file. Change the epub extension to zip, and you can decompress the folder to see what’s inside. There may be a folder for images, and the text is broken up into chapters, each an HTML file. There’s a style sheet that controls how the tagged text looks to the human reader. She’s found the best strategy to ensure that the ebook looks good on all devices is to keep styling to a minimum. “We’re not trying to replicate the print book,” she said. “We really have to reconceptualize it. We can’t control type in the same way.”
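
If you’d rather not rename files, a few lines of Python can peek inside an EPUB directly, since the standard zipfile module reads any ZIP archive regardless of its extension. This is only a minimal sketch: the function name is my own, and the archive it builds in memory is a made-up stand-in whose file names are hypothetical (real EPUBs vary in how they lay out their folders).

```python
import io
import zipfile


def list_epub_contents(epub_file):
    """Return the names of the files packed inside an EPUB (a ZIP archive)."""
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()


# Build a tiny stand-in archive in memory so the sketch runs without a real ebook.
# These file names are hypothetical examples, not a required layout.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("OEBPS/chapter1.xhtml", "<html><body><p>Chapter 1</p></body></html>")
    archive.writestr("OEBPS/styles.css", "p { margin: 0; }")
    archive.writestr("OEBPS/images/cover.jpg", b"")

buffer.seek(0)
contents = list_epub_contents(buffer)
for name in contents:
    print(name)
```

With a real ebook you would pass the path instead, for example list_epub_contents("book.epub"), where book.epub is whatever file you have on hand.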

Lara works mostly with books that are destined for both print and digital, so she exports from InDesign. But she notes that you can build an EPUB from scratch in a text editor, and there’s conversion software that will transform Word files into EPUBs (although they don’t look very good). The simpler your original files, she said, the better the EPUB will look. (For example, never justify your text; on many devices, the text will look hideous and gappy.)

When publishers convert books to EPUBs, they have the option of using a conversion service, which is inexpensive and may be appropriate for converting large numbers of files (e.g., the publisher’s backlist), but the results can look pretty rough. Another option is in-house conversion, which allows for more control over quality, style, and timelines but requires an investment in a dedicated individual or team of people who must learn how to use the software and prepare the files for the market. Editors working with individual authors to create single ebooks may be able to dedicate more resources to fine-tune the EPUBs themselves to specific devices and take full advantage of enhancements like audio and video.

Lara also mentioned vendor conversion tools, including iBooks Author, Kindle Direct Publishing, and Kobo Writing Life, which are free tools to use but restrict you to selling within those particular streams, and DIY options (what she referred to as “device-agnostic options”), such as Smashwords, PressBooks by WordPress, and Vook, which charge for creating the ebooks, whether through an upfront fee or through royalties. She noted that all of these options have a learning curve and a real cost.

Once you’ve got your ebook made, you then have to sell it. How are people going to find it? The answer is metadata—information attached to your book including title, author, publisher, ISBN, price, description, author bio, reviews, etc.—that will populate distributors’ and retailers’ databases. Metadata is key to discoverability.
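
Some of that metadata also travels inside the EPUB itself, in a package file that records the title, creator, and so on as Dublin Core elements. The sketch below reads those elements with Python’s standard XML parser. It covers only the metadata embedded in the file, not whatever feed a distributor requires; the function name is my own, and the package excerpt and the book details in it are invented for illustration.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"  # Dublin Core namespace

# Hypothetical package-file excerpt; the book details are invented.
OPF_SAMPLE = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>An Example Book</dc:title>
    <dc:creator>A. N. Author</dc:creator>
    <dc:publisher>Example Press</dc:publisher>
    <dc:identifier>urn:isbn:0000000000000</dc:identifier>
  </metadata>
</package>"""


def read_metadata(opf_xml):
    """Collect the Dublin Core fields from a package document into a dict."""
    root = ET.fromstring(opf_xml)
    fields = {}
    for element in root.iter():
        if element.tag.startswith(DC):
            # Drop the namespace prefix so the key is just "title", "creator", etc.
            fields[element.tag[len(DC):]] = (element.text or "").strip()
    return fields


metadata = read_metadata(OPF_SAMPLE)
for key, value in metadata.items():
    print(key, "=", value)
```

If a field appears more than once (two creators, say), this simple version keeps only the last one; a fuller tool would collect lists.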

Lara then moved on to the contentious issue of digital rights management (DRM), which puts a lock on EPUB files and prevents copying, editing, and reselling but also limits legitimate sharing of books and device switching. It pits readers’ freedoms against authors’ and publishers’ right to profit. The debate seems to be heading in two directions: digital media may be licensed to readers (where they can read but don’t actually own the book), or publishers may decide not to use DRM at all. (O’Reilly Media, in fact, has declared that it won’t be using DRM on any of its books.)

Another issue facing publishers is that EPUBs have the capability to incorporate a variety of assistive technologies, such as text to speech, alternative text, phonetic text, media overlays, dyslexic reading aids, conversion to braille, etc., and international accessibility organizations are pushing publishers to include all of these features. Of course, for the publisher, doing so means a lot more investment in editorial and production resources.

Lara was careful to note the distinction between apps and ebooks. Apps are self-contained applications, and they can be interactive and include all sorts of multimedia features. There are book apps—kids’ books work really well as apps, because they don’t have a lot of content but can support a lot of interactivity. Apps take more development than an ebook, and you need to involve a programmer.

So what are the editorial concerns surrounding e-publishing? First, the publisher must have the digital rights—including for the images that are to appear in the book. Next, the publisher should look at the content and figure out the best way to present the book (fixed or flowable) and decide whether to add enhancements.

Elements like sidebars are a challenge for ebook publishers: you want to place them at section or chapter breaks so that they don’t interrupt the flow of the text. Lara noted that ebooks are read in a linear way; it becomes tedious to have to skip over what could turn into pages of sidebar content to get back to the main text, especially if you’re reading on a small screen. Footnotes are also a problem, because the foot of a page is no longer well defined. Indexes are similarly challenging. (See my summary of Jan Wright’s discussion of ebook indexes from this past spring’s ISC conference.)

On the flip side are the many advantages that ebooks offer. For example, endnotes can be linked, as can in-text references. Photo sections can go anywhere within the book, not necessarily just between printed signatures. You can make URLs in the book (and the references, especially) live, and you can add audio or video enhancements. Finally, there are no page limits, and you can really play around with the concept of what a book is. Lara warns, however, that the more fun stuff you put in, the greater the risk that something will break, and broken links or videos, for example, can frustrate readers.

Lara’s talk was phenomenal. I learned a huge amount, though I will probably eventually have to resign myself to the fact that she knows more about e-publishing than I ever will.