smoores.dev - Announcing: @smoores/epub

Announcing: `@smoores/epub`

Dec. 13, 2024

Like most software projects, Storyteller stands on the shoulders of many open source giants. It relies on Echogarden, whisper.cpp, and FFmpeg for audio processing and automated audio transcription. It relies on React, Next.js, React Native, and Expo for UI and app development. It relies on a massive ecosystem of open source programming languages, development tools, and libraries. And without them, it simply couldn’t exist. Storyteller’s relatively rapid development and increasingly full feature set are only possible because of these other free, open source software projects, some of which seem to have completely solved the problems of their domain.

But when I started developing Storyteller, there was a hole in this ecosystem that was surprising to me. Storyteller publications are based on the EPUB specification, which is itself an open source project, maintained by the W3C. It’s a good spec! And yet, there seemed to be very few open source libraries dedicated to working on EPUB publications. Initially, Storyteller’s backend was primarily written in Python, a language with a lively, wide-ranging open source ecosystem. At the time, the only contender in the space was EbookLib. This is a really neat library, but it wasn’t designed for Storyteller’s use case — modifying existing EPUBs — and it made some assumptions that made it essentially impossible to use. At the time, it was fully in maintenance mode, though it seems like that may be changing.

When I re-wrote Storyteller with Node.js (something that was truly only possible due to the outstanding Echogarden library), I found the situation even more dire. There was a larger quantity of packages, but nearly all of them were abandoned, and none of them provided the flexibility that I needed for Storyteller.

Since working with EPUB 3 files is Storyteller’s primary domain, I decided it was worth the effort to build my own EPUB library within Storyteller. It began as a very low-level library, as Storyteller largely needed to modify the XHTML representation of chapters and add metadata to link chapters to media overlays. Eventually, I added higher-level APIs for ergonomically manipulating publication metadata, like titles and authors. I think that this approach will, in the long run, be a recipe for this library’s success: as I mentioned before, many other EPUB libraries attempted to abstract away the underlying EPUB structure, resulting in inflexible APIs that didn’t meet Storyteller’s needs.

After working on this library for several months, a colleague reached out to ask for advice on a project they were working on. They wanted to be able to generate EPUB publications programmatically, but they were running into the same issues that I had. The landscape for open source EPUB libraries in Node.js consisted of mostly unmaintained or incomplete projects.

I knew what I had to do. So for the past few weeks, I’ve been pulling Storyteller’s EPUB code out into its own library, adding documentation, and cleaning up the public-facing API. I think it’s ready to share with the world, so here it is: @smoores/epub

What does it do? And what exactly is an EPUB, anyway?

An EPUB file is a ZIP archive with a partially specified directory and file structure. Most of the metadata and content is specified as XML documents, with additional resources referenced from those XML documents. The most important of these documents is the package document.

The package document is an XML document that consists of a set of elements that each encapsulate information about a particular aspect of an EPUB publication. These elements serve to centralize metadata, detail the individual resources, and provide the reading order and other information necessary for its rendering.
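To make the "partially specified" layout concrete: one path the container format does fix is META-INF/container.xml, whose rootfile element records where the package document lives inside the archive. The snippet below is a minimal sketch of reading that pointer, not how @smoores/epub does it. `findPackageDocumentPath` is a hypothetical helper, the sample path `OEBPS/package.opf` is just an example, and the regex stands in for a real XML parser to keep the sketch dependency-free.

```typescript
// Hypothetical helper, for illustration only; not part of @smoores/epub.
// Sample contents of META-INF/container.xml. The package document path
// ("OEBPS/package.opf") is an example; publications may place it elsewhere.
const containerXml = `<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/package.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>`

// Returns the archive-relative path of the package document, or null
// if no <rootfile> element with a full-path attribute is found.
function findPackageDocumentPath(xml: string): string | null {
  // A regex keeps this sketch small; real code should use an XML parser.
  const match = /<rootfile\b[^>]*\bfull-path="([^"]+)"/.exec(xml)
  return match?.[1] ?? null
}

console.log(findPackageDocumentPath(containerXml))
```

Everything else in the archive (chapters, images, stylesheets) is found by following references out of the package document.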

@smoores/epub is primarily concerned with providing access to the metadata, manifest, and spine of the EPUB publication. Metadata refers to information _about_ the publication, such as its title or authors. The manifest refers to the complete set of resources that are used to render the publication, such as XHTML documents and image files. And the spine refers to the ordered list of manifest items that represent the default reading order — the order that readers will encounter the manifest items by simply turning pages one at a time.
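The relationship between the manifest and the spine can be sketched with a few plain types. This is only an illustration of the data model described above: `ManifestEntry`, `PackageSketch`, and `readingOrder` are hypothetical names made up for this sketch, not the library's own types or API.

```typescript
// Illustrative types only; assumptions for this sketch, not @smoores/epub's types.
interface ManifestEntry {
  id: string
  href: string
  mediaType: string
}

interface PackageSketch {
  metadata: { title: string; language: string; identifier: string }
  // Every resource used to render the publication
  manifest: ManifestEntry[]
  // Ordered list of manifest item ids: the default reading order
  spine: string[]
}

const pkg: PackageSketch = {
  metadata: {
    title: "S'mores For Everyone",
    language: "en-US",
    identifier: "urn:uuid:example",
  },
  manifest: [
    { id: "cover-image", href: "images/cover.jpg", mediaType: "image/jpeg" },
    { id: "chapter-two", href: "XHTML/chapter-two.xhtml", mediaType: "application/xhtml+xml" },
    { id: "chapter-one", href: "XHTML/chapter-one.xhtml", mediaType: "application/xhtml+xml" },
  ],
  spine: ["chapter-one", "chapter-two"],
}

// Resolve the reading order by looking up each spine id in the manifest.
// Spine ids with no matching manifest item are skipped.
function readingOrder(p: PackageSketch): string[] {
  const byId = new Map(p.manifest.map((item) => [item.id, item] as const))
  return p.spine.flatMap((id) => {
    const item = byId.get(id)
    return item ? [item.href] : []
  })
}

console.log(readingOrder(pkg))
```

Note that the manifest order is irrelevant here, and the cover image never appears in the result: it is a resource of the publication, but not part of the reading order.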

Here are some examples of how the library can be used:

fromFile.ts

/**
 * You can read from an existing EPUB publication file
 */
import { Epub } from "@smoores/epub"

const epub = await Epub.from("path/to/book.epub")
console.log(await epub.getTitle())

fromScratch.ts

/**
 * You can also construct an EPUB from scratch
 */
import { randomUUID } from "node:crypto"

import { Epub } from "@smoores/epub"

const epub = await Epub.create({
  title: "S'mores For Everyone",
  // This should be the primary language of the publication.
  // Individual content resources may specify their own languages.
  language: new Intl.Locale("en-US"),
  // This can be any unique identifier, including UUIDs, ISBNs, etc
  identifier: randomUUID(),
})

fromScratch.ts

/**
 * You can modify the in-memory EPUB instance however you need,
 * and then write it back to disk
 */
import { Epub, type ManifestItem } from "@smoores/epub"

const epub = await Epub.from("path/to/book.epub")

// Construct a manifest item describing the chapter
const manifestItem: ManifestItem = {
  id: "chapter-one",
  // This is the filepath for the chapter contents within the
  // EPUB archive.
  href: "XHTML/chapter-one.xhtml",
  mediaType: "application/xhtml+xml",
}

// You can specify the contents as a string
const contents = `<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops"
      xml:lang="en-US"
      lang="en-US">
  <head></head>
  <body>
    <h1>Chapter 1</h1>
    <p>At first, there were s'mores.</p>
  </body>
</html>`

// First, add the new item to the manifest, and add
// its contents to the publication
await epub.addManifestItem(manifestItem, contents, "utf-8")

// Then add the item to the spine
await epub.addSpineItem(manifestItem.id)

// Finally, write back to disk
await epub.writeToFile("path/to/modified.epub")

If you’re like me, that last example may have raised an eyebrow. I’m not the biggest fan of templating XML/XHTML strings, especially if the content may need to be conditional or repetitive. Storyteller, in particular, needs to be able to wrap each sentence in a span tag, without breaking any existing markup on the text. So in addition to supporting byte arrays and Unicode strings as encodings for manifest items, the library also supports fast-xml-parser’s XML structures, and provides a basic API for working with them. Here’s how we can re-write the above example with programmatic XHTML generation:

withXml.ts

import { Epub, type ManifestItem } from "@smoores/epub"

const epub = await Epub.from("path/to/book.epub")

// Construct a manifest item describing the chapter
const manifestItem: ManifestItem = {
  id: "chapter-one",
  // This is the filepath for the chapter contents within the
  // EPUB archive.
  href: "XHTML/chapter-one.xhtml",
  mediaType: "application/xhtml+xml",
}

// You can specify the contents as XML
const contents = epub.createXhtmlDocument([
  Epub.createXmlElement("h1", {}, [
    Epub.createXmlTextNode("Chapter 1")
  ]),
  Epub.createXmlElement("p", {}, [
    Epub.createXmlTextNode("At first, there were s'mores."),
  ]),
])

// First, add the new item to the manifest, and add
// its contents to the publication
await epub.addManifestItem(manifestItem, contents, "xml")

// Then add the item to the spine
await epub.addSpineItem(manifestItem.id)

// Finally, write back to disk
await epub.writeToFile("path/to/modified.epub")

For more details on how to use the library, you can check out the API docs on npm!

And speaking of docs, I ended up spending quite a lot of time piecing together these docs in a way that I was happy with. I had a few fairly basic requirements:

- The documentation should live in the README. I don’t have anything against documentation sites (Storyteller has one!), but this is a dedicated, single-purpose library, and I wanted to keep it simple.
- At least part of the documentation should be hand-written. I wanted a narrative section at the beginning of the docs that I could write myself.
- The table of contents should be automatically generated. I knew from previous experience that trying to manage it by hand was a recipe for disaster, and it would inevitably become stale if its upkeep wasn’t automated.
- There should be API docs automatically generated from the TypeScript type information. I firmly believe that good docs require both hand-written narrative documentation and full API documentation. And I had the same worries about keeping the API docs up-to-date as the table of contents.

For the table of contents, I already had a tool that I enjoyed and was familiar with from my work on React ProseMirror: markdown-toc. For automatically generating API docs from TypeScript, the best solution seemed to be TypeDoc, combined with the markdown and remark plugins.

In order to easily author my own narrative documentation, I created a readme-stub.md file. This is the only file that I actually update manually. Then I configured TypeDoc to generate a Markdown file with only the API docs in gen/README.md, which is gitignored. Here’s the full TypeDoc configuration:

typedoc.json

{
  "$schema": "https://typedoc-plugin-markdown.org/schema.json",
  // This tells TypeDoc where to start looking for types to document
  "entryPoints": ["./index.ts"],
  // I configure both the markdown and remark plugins.
  // The markdown plugin adds support for outputting the docs
  // as markdown; the remark plugin allows further customization
  // through remark plugins, configured below
  "plugin": ["typedoc-plugin-markdown", "typedoc-plugin-remark"],
  // We specifically don't want to merge with the existing readme;
  // we do that manually in a separate step, to avoid a giant
  // table of contents taking up the entire first page of
  // the readme
  "readme": "none",
  "mergeReadme": false,
  "gitRevision": "main",
  "outputFileStrategy": "modules",
  "out": "./gen",
  // Just some basic stylistic choices
  "hidePageHeader": true,
  "hideGroupHeadings": true,
  "formatWithPrettier": true,
  "parametersFormat": "table",
  // Here, we configure the remark-toc plugin, which creates a
  // table of contents for just the API docs
  "remarkPlugins": [["remark-toc", { "maxDepth": 4, "heading": "API Docs" }]],
  // Disable frontmatter and mdx remark plugins. MDX in particular
  // conflicts with markdown-toc, which relies on HTML-style comments
  // to identify where to insert the table of contents
  "defaultRemarkPlugins": {
    "gfm": true,
    "frontmatter": false,
    "mdx": false
  }
}

Finally, some package scripts allow us to set up the README compilation pipeline, generating the API docs, then generating the top-level table of contents, and finally merging the stub with the API docs to produce the final README:

package.json

{
  "name": "@smoores/epub",
  "scripts": {
    "build": "yarn swc ./index.ts -o ./index.js",
    // When we generate the top-level TOC, we manually add
    // an item for the API docs
    "readme:toc": "markdown-toc --maxdepth=5 --append='\n- [API Docs](#api-docs)' --bullets='-' -i readme-stub.md",
    "readme:api": "typedoc",
    // TypeDoc always inserts the name of the package at the
    // top of the doc, so we skip the first line when we concatenate
    // it into the final README
    "readme": "yarn readme:api && yarn readme:toc && cat readme-stub.md > README.md && tail -n +2 gen/README.md >> README.md",
    "test": "tsx --test",
    "test:watch": "tsx --test --watch"
  }
}
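The final merge step relies on `cat` and `tail`, which assumes a POSIX shell. For anyone who wants the same step without shell utilities, here is a rough Node sketch of the equivalent logic. `mergeReadme` and `buildReadme` are hypothetical names that are not part of the actual setup, and the file paths are simply the ones used in the scripts above.

```typescript
// Hypothetical Node equivalent of the `cat` + `tail -n +2` step;
// not part of the actual @smoores/epub tooling.
import { readFile, writeFile } from "node:fs/promises"

// Drop the first line of the generated API docs (the package name that
// TypeDoc inserts) and append the remainder to the hand-written stub.
function mergeReadme(stub: string, generatedApiDocs: string): string {
  const firstNewline = generatedApiDocs.indexOf("\n")
  const withoutFirstLine =
    firstNewline === -1 ? "" : generatedApiDocs.slice(firstNewline + 1)
  return stub + withoutFirstLine
}

// Reads the stub and the generated docs, then writes the merged README.
// Assumes readme-stub.md and gen/README.md already exist.
async function buildReadme(): Promise<void> {
  const stub = await readFile("readme-stub.md", "utf-8")
  const apiDocs = await readFile("gen/README.md", "utf-8")
  await writeFile("README.md", mergeReadme(stub, apiDocs))
}
```

Calling `await buildReadme()` from a small script after the `readme:api` and `readme:toc` steps would take the place of the shell concatenation.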

And that’s all! I hope you give @smoores/epub a shot! If you run into any issues, you can open an Issue on the Storyteller GitLab repo.