mdx-validate

Catches broken content before your readers do: links that go nowhere, pages that will not load, and frontmatter your code can no longer read.

The Problem

Some of your content is wired to other things. A link is wired to another page. The body of a page is wired to whatever renders it. A frontmatter field is wired to the code that reads it by name.

Any of those wires can break. The painful part is that they break quietly. The page does not crash and nothing turns red. It just ships wrong, and you find out from a reader, or from Google Search Console, weeks later.

A page gets renamed, and every old link to it now leads to a dead end.
Someone leaves a stray character in a page, and that one page stops loading entirely.
Someone renames a frontmatter field, and the code that built that part of the page can no longer find it, so the page ships with a blank section.

A spell-checker or a style linter cannot catch any of this, because the words still look fine. The only way to catch a broken wire is to check both ends of it. That is the whole job of this tool: it checks both ends of every wire in your content, before you deploy.

What it checks

Check	In plain words	The silent failure it prevents
Links resolve	Every internal link points to a page you actually have	A renamed or deleted page leaves dead links that 404
Pages render	Every page actually compiles	One stray character takes a page down at render time
Frontmatter parses	The settings block at the top of each file is valid	A typo up top breaks the build
Fields match	Each field your code reads is named and shaped the way the code expects	You rename a field, the code cannot find it, and that part of the page ships blank

The first three are the obvious ones. The last one is the one nothing else catches, and it is the reason this tool exists.

Here is the bug that made me build this. A site I run keeps each guide's FAQs in its frontmatter, and the code turns that list into two things: the FAQ section readers see, and the hidden markup that makes those questions show up directly in Google results. The code expected each entry to be labeled q and a. One day a batch of translated guides arrived labeled question and answer instead. Nothing crashed. The guides looked perfect in the editor and in preview. But every FAQ section shipped empty, and the questions quietly fell out of Google. I did not notice for weeks, until Google Search Console reported the markup had broken across a dozen pages.

A spell-checker would not catch it. A link-checker would not catch it. The labels were simply wrong, and only the code knew which labels were right. So this tool lets you write the right labels down once, "the faq field must have q and a," and it catches the mismatch the moment it appears, before anything ships. You decide which fields matter and what shape they take, and the tool holds your content to it.
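To make the idea concrete, here is a minimal sketch of a "q and a" check in plain JavaScript. It is not the tool's actual code: checkItemShape is a hypothetical helper, and it assumes the frontmatter has already been parsed into a plain object.

```javascript
// Hypothetical helper (not the tool's real implementation): report which
// required keys each item of a list-valued frontmatter field is missing.
function checkItemShape(frontmatter, field, requiredKeys) {
  const items = frontmatter[field];
  if (items === undefined) return []; // field absent: nothing to compare
  if (!Array.isArray(items)) return [`${field} must be a list`];

  const errors = [];
  items.forEach((item, i) => {
    const present = item && typeof item === "object" ? Object.keys(item) : [];
    const missing = requiredKeys.filter((key) => !present.includes(key));
    if (missing.length > 0) {
      errors.push(
        `${field}[${i}] is missing ${missing.join(", ")} (it has: ${present.join(", ")})`
      );
    }
  });
  return errors;
}

// The translated guide from the story above: right content, wrong labels.
const translated = { faq: [{ question: "Is it free?", answer: "Yes." }] };
console.log(checkItemShape(translated, "faq", ["q", "a"]));
```

The point of the sketch is that the check needs nothing from the prose itself, only the list of labels the rendering code expects.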

Why these checks exist

That FAQ bug was not a one-off. Each check here is in the tool because something like it broke in production on a real multi-language content site:

A leftover review note took whole pages down. A translator left an HTML comment (<!-- ... -->) in a guide. MDX does not allow HTML comments, so the entire guides listing for two languages returned a 500. (Pages render.)
Renamed guides left dead links across five languages. Internal links kept pointing at slugs that no longer existed, and they 404'd for weeks, burning crawl budget. (Links resolve.)
A "HowTo" with no steps quietly became a plain article. A page declared schema: HowTo but the steps were dropped in translation, so the Google rich result silently downgraded. (Fields match.)
A stray quote in the frontmatter broke the build. An unescaped " inside a YAML string crashed the parser. (Frontmatter parses.)

Demo

The demo runs against examples/content/, which deliberately contains broken files so you can see what a catch looks like. It exits non-zero on purpose:

mdx-validate: FAIL. 5 error(s) across 4 of 5 file(s).

examples/content/broken-faq.mdx
  [SHAPE] faq[0] is missing q, a (it has: question, answer); your code reads { q, a } and will silently drop this entry

examples/content/dead-link.mdx
  [BROKEN-LINK] links to "/guides/this-page-was-renamed" but no content resolves to slug "this-page-was-renamed"

examples/content/howto-no-steps.mdx
  [SHAPE-MISSING] "steps" is required here but missing (expected an array of { name, text })

examples/content/html-comment.mdx
  [HTML-COMMENT] body contains an HTML comment <!-- ... -->; MDX rejects it (use {/* ... */} or remove)
  [MDX-COMPILE] Unexpected character `!` (U+0021) before name, expected a character that can start a name

The one file that passes, good-guide.mdx, is not listed. Clean files stay quiet.

Quick Start

See it work:

git clone https://github.com/hwajongpark/mdx-validate
cd mdx-validate
npm install
npm run demo

Use it on your own content:

npm install --save-dev mdx-validate

# copy the example config and point it at your content
cp node_modules/mdx-validate/examples/mdx-validate.config.example.json ./mdx-validate.config.json

# edit mdx-validate.config.json, then:
npx mdx-validate

Wire it into your build so a broken page fails the deploy, not the reader:

{
  "scripts": {
    "prebuild": "mdx-validate"
  }
}

Configuration

One mdx-validate.config.json at your project root. The included examples/mdx-validate.config.example.json is the demo config.

{
  "contentDir": "content",
  "extensions": [".mdx"],
  "requiredFrontmatter": ["title", "description"],
  "shapeContracts": [
    { "field": "faq", "itemShape": ["q", "a"] },
    { "field": "sources", "itemShape": ["label", "url"] },
    { "field": "steps", "itemShape": ["name", "text"], "when": { "field": "schema", "equals": "HowTo" } }
  ],
  "internalLink": {
    "pattern": "/guides/([a-z0-9-]+)",
    "resolveTargetsFrom": "content",
    "targetExtensions": [".mdx"],
    "extraValidSlugs": []
  }
}

contentDir: the folder to check. No glob needed; it walks the tree.
requiredFrontmatter: fields that must be present and not empty.
shapeContracts: the "fields match" check. Each entry says "this field must be a list of items with these keys." Add "when" to make an entry conditional, so steps is only required when schema is HowTo.
internalLink: a pattern with one capture group for the slug, and where to find valid slugs (the filenames of your content). extraValidSlugs covers pages served by a redirect instead of a file.
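As an illustration of the "when" clause, the sketch below decides which contracts apply to a given page. The function names are hypothetical, and it assumes "when" means strict equality between one frontmatter field and one value; the tool's internal evaluation may differ.

```javascript
// Two contracts copied from the example config above.
const shapeContracts = [
  { field: "faq", itemShape: ["q", "a"] },
  {
    field: "steps",
    itemShape: ["name", "text"],
    when: { field: "schema", equals: "HowTo" },
  },
];

// Assumed semantics: a contract with no "when" always applies; otherwise it
// applies only if the named frontmatter field strictly equals the value.
function contractApplies(contract, frontmatter) {
  if (!contract.when) return true;
  return frontmatter[contract.when.field] === contract.when.equals;
}

// Hypothetical helper: names of the fields whose contracts apply to a page.
function applicableFields(frontmatter) {
  return shapeContracts
    .filter((contract) => contractApplies(contract, frontmatter))
    .map((contract) => contract.field);
}

console.log(applicableFields({ schema: "HowTo" }));   // faq and steps
console.log(applicableFields({ schema: "Article" })); // faq only
```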

How It Works

It checks both ends of every wire. A link is only good if the page it points to exists. A field is only good if it is shaped the way the code that reads it expects. The tool knows both ends and compares them.
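For links, comparing both ends might look roughly like the sketch below. findBrokenLinks is a hypothetical name, the link config mirrors the internalLink section of the example config, and slugs are assumed to be content filenames with the extension removed.

```javascript
// Hypothetical helper: find internal links whose slug matches no known page.
function findBrokenLinks(body, knownSlugs, linkConfig) {
  // The "g" flag is required so matchAll can walk every link in the body.
  const pattern = new RegExp(linkConfig.pattern, "g");
  const valid = new Set([...knownSlugs, ...linkConfig.extraValidSlugs]);

  const broken = [];
  for (const match of body.matchAll(pattern)) {
    const slug = match[1]; // the single capture group holds the slug
    if (!valid.has(slug)) broken.push({ link: match[0], slug });
  }
  return broken;
}

// One end of the wire: the pages that exist (assumed: filename minus extension).
const files = ["good-guide.mdx", "broken-faq.mdx"];
const knownSlugs = files.map((name) => name.replace(/\.mdx$/, ""));

// The other end: the links a page body actually contains.
const body =
  "See [setup](/guides/good-guide) and [old](/guides/this-page-was-renamed).";
const linkConfig = { pattern: "/guides/([a-z0-9-]+)", extraValidSlugs: [] };

console.log(findBrokenLinks(body, knownSlugs, linkConfig));
```

Adding "this-page-was-renamed" to extraValidSlugs would make the same body pass, which is the escape hatch for pages served by a redirect.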

You define what matters, not the tool. Your fields and shapes are yours to declare. The tool has no opinion about your content, only that your content matches the rules you wrote down.

It compiles the real MDX, it does not guess. The render check runs the actual MDX compiler, so it catches anything that would throw on a real page, not just patterns a regex anticipated.

It reports, you fix. It finds the break and tells you exactly where. Fixing is your call, because sometimes a change is intentional. It never rewrites your content.

Almost no moving parts. It walks folders with the standard library, reads frontmatter with gray-matter, and compiles with @mdx-js/mdx. That is the entire toolchain. Exit codes are built for CI: 0 clean, 1 something broke, 2 a config or internal error.
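The exit-code convention can be pictured with a small sketch. exitCodeFor and runChecks are hypothetical names, and the sketch assumes that a thrown exception stands for a config or internal error.

```javascript
// Exit codes as described above.
const EXIT_CLEAN = 0;    // no problems found
const EXIT_BROKEN = 1;   // at least one check failed
const EXIT_INTERNAL = 2; // config or internal error

// runChecks is a hypothetical function that returns an array of error strings
// and throws if the config cannot be read or something unexpected happens.
function exitCodeFor(runChecks) {
  try {
    const errors = runChecks();
    return errors.length === 0 ? EXIT_CLEAN : EXIT_BROKEN;
  } catch (err) {
    return EXIT_INTERNAL;
  }
}

console.log(exitCodeFor(() => []));
console.log(exitCodeFor(() => ["[BROKEN-LINK] ..."]));
console.log(exitCodeFor(() => { throw new Error("config not found"); }));
```

Keeping 1 and 2 separate lets a CI job tell "the content is broken" apart from "the validator itself could not run".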

What It Does Not Do

It is not a style or spelling checker. For formatting and prose, use a style linter or Prettier. This checks that things are wired correctly, not that they read well.
It does not check that external websites are up. It only checks that your own internal links point to content you have.
It does not auto-fix. It shows you every break; you decide the fix.

Contributing

Contributions are welcome, and the most useful kind is a class of silent break this tool does not yet catch. If you have hit one, open an issue describing it, or a pull request. The fastest way to land a fix is a reproduction in examples/content/: a small broken file plus the catch you expected. Bug reports and false positives are welcome too.

License

MIT
