I recently had the good fortune of spending a few days with one of the leading production houses in South East Asia. This is a firm that’s big into ebook conversion and processes literally thousands of books a week from print files to ebooks in various formats. What amazed me most about my time with the conversion team was the volume of brand new titles going through this process.
At Out of House Publishing we’re used to operating an XML first workflow which we use to output XML, HTML and various ebook formats at almost any stage in the production cycle. That so many publishers are still converting back list is baffling to me, so I thought I’d highlight four big advantages of running a combined digital and print workflow, rather than converting post-production:
1. Evolving digital products alongside print helps publishers build in greater interactivity and linking to their digital files, and most importantly gives them more time to check and validate these features. Our copy-editors help us check linked cross references to make sure we hit maximum accuracy.
2. Even the very best conversions throw up some errors, be it with hyphenation turnovers, placement of figures or scrambling of special characters. Again, running digital production with print helps publishers iron out these things early on and avoids conversion errors creeping in under the radar post-production.
3. Authors, editors, marketeers and almost anyone else can be involved in digital product development. Seeing digital products emerge iteratively helps us all identify new opportunities for enhanced features, additional content and marketing ideas. These opportunities are not typically afforded to publishers converting back list titles in large batches.
4. Time to market can be faster. Output digital and print files simultaneously and publishers can have digital products out in the marketplace weeks before printed books. And with evidence that digital products help drive print sales, this can drive revenue across the piece.
Don’t get me wrong, ebook conversion continues to do wonders for our industry – helping bring backlist to life and improving revenue in a tightening market. I just think there’s a better way for handling new content.
What other advantages do you see from running digital files alongside print? Are you relying on back list conversion and find it works for you – what are the major advantages over XML first?
Google indexes our virtual world. The Google bot makes sure our search results are relevant and accurate. It’s a fascinating system, one you can read more about over at the Google Guide.
In the publishing industry we pay people to index our content. It’s a practice that probably dates back to Greek and Roman times (Gary Forsyth writes more on this for the American Society for Indexing here). Now that we’re reading more digital content than ever, publishers are starting to question the value of the index. An ereader can search the book, right? Readers can use the search box to find any term they want. Well yes, but does that really offer the same value as an index crafted by hand or algorithm? Are the results relevant and accurate, in other words do they take the reader where they want to go? A simple word search of a book on the orchestra might bring up tens of occurrences of the word “violin”, hundreds even. But how many point to the really pertinent stuff – the places in the content where the author describes the violin, defines its role in the orchestra and its history? Maybe half a dozen at most.
My own view is that producing ebooks with smart indexes – taking the reader right to the point in the text where their query is discussed – enriches our content and sets the publishing world apart from other content providers.
As the Google Guide says, “PageRank is Google’s system for ranking web pages. A page with a higher PageRank is deemed more important and is more likely to be listed above a page with a lower PageRank”. To make book content, particularly non-fiction, truly accessible to readers we need a similar system in every book. We have it in fact – it’s called the index. Until we have really smart algorithms and tools for intelligent search installed on our devices, my bet is on the trusty indexer continuing to guide us through to the content we actually need, when we need it.
At Out of House we understand XML workflows and use them to produce a smart hyperlinked index in the back of many of the books we produce. Contact us to find out more.
EPUB is the open access, device independent ebook format being widely adopted across the publishing world. Here are three things you might not already know about EPUB files:
- Under the bonnet an EPUB file is a ZIP file containing mostly XHTML along with image, metadata and indexing files that draw everything together. Copy your EPUB, rename it as a ZIP and take a look inside. You might think of an EPUB file as being a bit like an InDesign package: there’s a single index bringing all of the constituent parts together. If you’ve converted your backlist content to EPUB then remember that in your ZIP file will be all the constituent parts of your book that (rights permitting) you can store in your Digital Asset Management system for reuse elsewhere.
- You can open EPUB files in a standard browser. It is just HTML after all. I use EPUBReader for Firefox.
- EPUB files carry their own metadata. This enables retailers and aggregators to read, catalogue and index content without the need for additional data from the publisher. An EPUB is a neat little package with everything it needs to get your content out to the market.
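For readers who would rather script the "take a look inside" step than rename files by hand, here is a minimal Python sketch using only the standard library. It lists the files in an EPUB and pulls out the Dublin Core metadata from the package file that `META-INF/container.xml` points to. The function names and the tiny sample book built in memory are illustrative assumptions, not part of any Out of House tooling, and a real EPUB will contain rather more than this.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
DC_NS = "http://purl.org/dc/elements/1.1/"


def read_epub_metadata(epub_file):
    """Return (file names, Dublin Core metadata dict) for an EPUB path or file object."""
    with zipfile.ZipFile(epub_file) as archive:
        names = archive.namelist()
        # META-INF/container.xml tells us where the package (OPF) file lives.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", CONTAINER_NS).get("full-path")
        package = ET.fromstring(archive.read(opf_path))
    metadata = {}
    for element in package.iter():
        if element.tag.startswith("{" + DC_NS + "}"):
            key = element.tag.split("}", 1)[1]
            # Keep the first value seen for each Dublin Core field.
            metadata.setdefault(key, (element.text or "").strip())
    return names, metadata


def build_sample_epub():
    """Build a tiny EPUB-like archive in memory (hypothetical sample content)."""
    container = (
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="OEBPS/content.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles></container>'
    )
    opf = (
        '<package xmlns="http://www.idpf.org/2007/opf" version="3.0">'
        '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:title>Sample Title</dc:title><dc:language>en</dc:language>"
        "</metadata></package>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", container)
        archive.writestr("OEBPS/content.opf", opf)
        archive.writestr("OEBPS/chapter1.xhtml", "<html><body><p>Hello</p></body></html>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    names, metadata = read_epub_metadata(build_sample_epub())
    print(names)
    print(metadata)
```

To try it on a real book, pass the path to your own EPUB file to `read_epub_metadata` in place of the in-memory sample.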
Out of House Publishing provides an excellent backlist to EPUB conversion service. More importantly, we understand the value in adopting structured content early on in the production process. Our XML first workflows mean that EPUB, XML and other digital outputs are delivered seamlessly along with print files.
Why continue converting print to digital when you can run both together? Contact us to find out more.
Today’s Apple announcement about its new e-book publishing platform and tools could well be the gamechanger we’ve long been expecting.
These tools all require digital content of course, and properly structured content has to be the key to really take advantage of the opportunities of digital publishing. Converting your backlist to structured content like XML and EPUB can really help you unleash the value sitting in your PDF and paper assets. And there are plenty of other platforms and formats out there – EPUB is still very much alive and well.
Use XML to future proof your content and you’ll be ready for the next big publishing announcement!
Contact us now to go digital!
Many publishers are adopting XML in their production workflows. Indeed, in the journals industry XML is pretty much ubiquitous. Jo Bottrill gives five good reasons for adopting XML into a production workflow.
- Enrich content. Use your XML coded content to turn flat text into rich web/device ready content – from simple links between elements such as tables, references and so on, to smart embedded indexes and links out to external content – XML can really bring your content to life and make it more accessible.
- Repurpose content. Switch on multi-channel publishing at the touch of a button – with content encoded in XML against a well established DTD, your content can quickly be output tailored for any device or format (ePub on a web browser for example). And, with an extensive, well constructed repository of XML content, publishers can more easily meld material from different sources to produce new products and serve new niche markets.
- Improve content. With a greater focus on content structure and taking a consistent approach across various product streams, an XML workflow can help improve the quality of content, presenting ideas in a more consistent pattern with sensible hierarchies.
- Future proof content. XML is the native format for holding content and that’s unlikely to change any time soon. With everything properly coded up in XML a publisher is in a perfect position to quickly take advantage of new ways of reading, new methods for enhancing data and new markets.
- Increase sales. Ultimately all of this helps publishers diversify their revenue streams. By offering high quality, rich content, tailored to a niche market and delivered via various platforms and devices the scope for improving sales per unit of content increases significantly.
Contact us to find out more about how Out of House Publishing can help you adopt XML into your production process.
We use XML to produce books and digital content for our clients. We’re not XML geeks, we don’t edit in XML nor do we spend hours poring over pages of code. But our workflow is centred around XML – it drives the production process, including editing and indexing.
The embedded indexing workflow we use enables our clients to produce a smart index in their ebooks.
We ask our indexers – and authors indexing their own books – to index in Microsoft Word, using either the inbuilt indexing tool or a third party system. For professional indexers we recommend James Lamb’s Word Embed, which ties in well with existing indexing software such as Cindex or Macrex.
The digitally indexed files are combined with the copy-edited content, everything merging into one XML repository complete with active links between an item in the index and the relevant locations in the content. For the print product the page numbers are dynamic so they can be updated if the text is reflowed. For electronic products the page numbers can be substituted for hyperlinks back to the relevant locations in the text.
Not only does this linking provide rich XML which publishers can use to help their readers mine content – it also helps the production process. Gone are (some) of the barriers to repagination part way through a project. Repurposing content for a new edition or mashing content from various sources? Take the index tags with you and retain a valid index.
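To make the idea concrete, here is a rough Python sketch of how one set of embedded index markers can feed both a print index and a linked ebook index. The element name `index-term`, its `id` and `entry` attributes, and the page lookup table are all hypothetical choices for illustration; they are not the tagging used in our own workflow or in any particular DTD.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical content: each index marker sits at the point in the text it refers to.
SAMPLE = """
<book>
  <para>The <index-term id="ix1" entry="violin"/>violin leads the strings.</para>
  <para>The <index-term id="ix2" entry="cello"/>cello sits lower.</para>
  <para>History of the <index-term id="ix3" entry="violin"/>violin.</para>
</book>
"""


def collect_index(root):
    """Map each index entry to the ids of its embedded markers, in document order."""
    entries = defaultdict(list)
    for marker in root.iter("index-term"):
        entries[marker.get("entry")].append(marker.get("id"))
    return dict(entries)


def print_index(entries, page_of):
    """Render index lines using page numbers taken from the current pagination."""
    lines = []
    for entry in sorted(entries):
        pages = sorted({page_of[marker_id] for marker_id in entries[entry]})
        lines.append(f"{entry}, " + ", ".join(str(page) for page in pages))
    return lines


def ebook_index(entries):
    """Render index lines as HTML links back to the marker locations."""
    lines = []
    for entry in sorted(entries):
        links = ", ".join(
            f'<a href="#{marker_id}">{number}</a>'
            for number, marker_id in enumerate(entries[entry], start=1)
        )
        lines.append(f"{entry}: {links}")
    return lines


if __name__ == "__main__":
    entries = collect_index(ET.fromstring(SAMPLE))
    # If the text reflows, only this lookup changes; the markers stay put.
    print(print_index(entries, {"ix1": 3, "ix2": 3, "ix3": 7}))
    print(ebook_index(entries))
```

Because the markers travel with the text, repaginating means regenerating the page lookup rather than re-indexing, which is the point made above about reflow and new editions.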
Contact us if you’d like to know more about how embedded indexing can be a part of your digital publishing strategy.
Tomorrow I shall be hosting a seminar for some colleagues to look at structuring content for copy-editing using Microsoft Word styles. I ran a similar session last year and it was well received. I aim to break down some of the mystery about copy-editing in Word, show some examples onscreen and give my colleagues some confidence to get on and have a play in Word.
We use templates and styles in Word whenever we can. This improves the consistency and structural integrity of the content we process, and it makes our copy-editing and typesetting processes more efficient (copy-editors aren’t labouring away keying in codes and typesetters aren’t spending hours manually mapping those to their stylesheets). We can format particular types of content to make them easier to process (setting briefing notes in a different colour for example). This all helps shift focus to the structural integrity of the content entrusted to us rather than on the minutiae of formatting (the distinction between the Word style applied to some text and the details of its format – bold, italic, font etc. – is important to understand).
Getting the hierarchy right is particularly important when you start to think about XML (as most publishers are now doing). Mapping a template and suite of styles to the DTD (document type definition – the XML rules against which the structure of your content is tested) for your content gives a straightforward way of validating against the DTD without having to train editors in the intricacies of XML mark-up.
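As a rough illustration of what "mapping styles to the DTD" means in practice, the Python sketch below turns a list of styled paragraphs into XML elements and flags any paragraph whose style has no mapping. The style names, element names and the `(style, text)` input format are assumptions made up for this example; a real workflow would read the styles from the Word file and then validate the result against the DTD with a validating parser, which the standard library used here does not do.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word paragraph styles to XML element names.
STYLE_TO_ELEMENT = {
    "Chapter Title": "title",
    "Heading 1": "section-title",
    "Body Text": "para",
    "Briefing Note": "note",
}


def styled_paragraphs_to_xml(paragraphs):
    """Convert (style, text) pairs to XML and collect any styles with no mapping."""
    root = ET.Element("chapter")
    unmapped = []
    for style, text in paragraphs:
        element_name = STYLE_TO_ELEMENT.get(style)
        if element_name is None:
            # An unmapped style is an early warning that the content has
            # drifted from the template, long before DTD validation.
            unmapped.append(style)
            continue
        ET.SubElement(root, element_name).text = text
    return root, unmapped


if __name__ == "__main__":
    root, unmapped = styled_paragraphs_to_xml(
        [
            ("Chapter Title", "Strings"),
            ("Body Text", "The violin."),
            ("Normal", "A stray paragraph."),
        ]
    )
    print(ET.tostring(root, encoding="unicode"))
    print("Unmapped styles:", unmapped)
```

The editor only ever sees Word styles; the mapping table is where the XML knowledge lives, which is why editors do not need training in mark-up.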
What tips do you have for marking up onscreen content for typesetting?