Catches broken content before your readers do: links that go nowhere, pages that will not load, and frontmatter your code can no longer read.
Some of your content is wired to other things. A link is wired to another page. The body of a page is wired to whatever renders it. A frontmatter field is wired to the code that reads it by name.
Any of those wires can break. The painful part is that they break quietly. The page does not crash and nothing turns red. It just ships wrong, and you find out from a reader, or from Google Search Console, weeks later.
- A page gets renamed, and every old link to it now leads to a dead end.
- Someone leaves a stray character in a page, and that one page stops loading entirely.
- Someone renames a frontmatter field, and the code that built that part of the page can no longer find it, so the page ships with a blank section.
A spell-checker or a style linter cannot catch any of this, because the words still look fine. The only way to catch a broken wire is to check both ends of it. That is the whole job of this tool: it checks both ends of every wire in your content, before you deploy.
| Check | In plain words | The silent failure it prevents |
|---|---|---|
| Links resolve | Every internal link points to a page you actually have | A renamed or deleted page leaves dead links that 404 |
| Pages render | Every page actually compiles | One stray character takes a page down at render time |
| Frontmatter parses | The settings block at the top of each file is valid | A typo up top breaks the build |
| Fields match | Each field your code reads is named and shaped the way the code expects | You rename a field, the code cannot find it, and that part of the page ships blank |
The first three are the obvious ones. The last one is the one nothing else catches, and it is the reason this tool exists.
Here is the bug that made me build this. A site I run keeps each guide's FAQs in its frontmatter, and the code turns that list into two things: the FAQ section readers see, and the hidden markup that makes those questions show up directly in Google results. The code expected each entry to be labeled q and a. One day a batch of translated guides arrived labeled question and answer instead. Nothing crashed. The guides looked perfect in the editor and in preview. But every FAQ section shipped empty, and the questions quietly fell out of Google. I did not notice for weeks, until Google Search Console reported the markup had broken across a dozen pages.
A spell-checker would not catch it. A link-checker would not catch it. The labels were simply wrong, and only the code knew which labels were right. So this tool lets you write the right labels down once, "the faq field must have q and a," and it catches the mismatch the moment it appears, before anything ships. You decide which fields matter and what shape they take, and the tool holds your content to it.
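To make the idea concrete, here is a minimal sketch of what a "fields match" comparison can look like. It is not the tool's actual code: `checkItemShape` and its arguments are hypothetical names chosen for illustration, and the message format only loosely follows the demo output.

```javascript
// Hypothetical helper, not the tool's real implementation.
// Returns one message per list item that lacks any of the required keys.
function checkItemShape(frontmatter, field, requiredKeys) {
  const value = frontmatter[field];
  if (value === undefined) return []; // field absent: nothing to compare
  if (!Array.isArray(value)) return [`${field} must be a list`];

  const errors = [];
  value.forEach((item, index) => {
    const present = item && typeof item === "object" ? Object.keys(item) : [];
    const missing = requiredKeys.filter((key) => !present.includes(key));
    if (missing.length > 0) {
      errors.push(
        `${field}[${index}] is missing ${missing.join(", ")} (it has: ${present.join(", ")})`
      );
    }
  });
  return errors;
}

// The translated guide uses the wrong labels; the original uses the right ones.
const translated = { faq: [{ question: "Is it free?", answer: "Yes." }] };
const original = { faq: [{ q: "Is it free?", a: "Yes." }] };

console.log(checkItemShape(translated, "faq", ["q", "a"]));
console.log(checkItemShape(original, "faq", ["q", "a"]));
```

The point of the sketch is that the expected keys live in one declared place, so a mismatch becomes a visible error instead of an empty section.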
That FAQ bug was not a one-off. Each check here is in the tool because something like it broke in production on a real multi-language content site:
- A leftover review note took whole pages down. A translator left an HTML comment (`<!-- check this -->`) in a guide. MDX does not allow HTML comments, so the entire guides listing for two languages returned a 500. (Pages render.)
- Renamed guides left dead links across five languages. Internal links kept pointing at slugs that no longer existed, and they 404'd for weeks, burning crawl budget. (Links resolve.)
- A "HowTo" with no steps quietly became a plain article. A page declared `schema: HowTo` but the steps were dropped in translation, so the Google rich result silently downgraded. (Fields match.)
- A stray quote in the frontmatter broke the build. An unescaped `"` inside a YAML string crashed the parser. (Frontmatter parses.)
The demo runs against examples/content/, which deliberately contains broken files so you can see what a catch looks like. It exits non-zero on purpose:
mdx-validate: FAIL. 5 error(s) across 4 of 5 file(s).
examples/content/broken-faq.mdx
[SHAPE] faq[0] is missing q, a (it has: question, answer); your code reads { q, a } and will silently drop this entry
examples/content/dead-link.mdx
[BROKEN-LINK] links to "/guides/this-page-was-renamed" but no content resolves to slug "this-page-was-renamed"
examples/content/howto-no-steps.mdx
[SHAPE-MISSING] "steps" is required here but missing (expected an array of { name, text })
examples/content/html-comment.mdx
[HTML-COMMENT] body contains an HTML comment <!-- ... -->; MDX rejects it (use {/* ... */} or remove)
[MDX-COMPILE] Unexpected character `!` (U+0021) before name, expected a character that can start a name
The one file that passes, good-guide.mdx, is not listed. Clean files stay quiet.
See it work:
git clone https://github.com/hwajongpark/mdx-validate
cd mdx-validate
npm install
npm run demo

Use it on your own content:
npm install --save-dev mdx-validate
# copy the example config and point it at your content
cp node_modules/mdx-validate/examples/mdx-validate.config.example.json ./mdx-validate.config.json
# edit mdx-validate.config.json, then:
npx mdx-validate

Wire it into your build so a broken page fails the deploy, not the reader:
{
"scripts": {
"prebuild": "mdx-validate"
}
}

All configuration lives in one mdx-validate.config.json at your project root. The included examples/mdx-validate.config.example.json is the demo config.
{
"contentDir": "content",
"extensions": [".mdx"],
"requiredFrontmatter": ["title", "description"],
"shapeContracts": [
{ "field": "faq", "itemShape": ["q", "a"] },
{ "field": "sources", "itemShape": ["label", "url"] },
{ "field": "steps", "itemShape": ["name", "text"], "when": { "field": "schema", "equals": "HowTo" } }
],
"internalLink": {
"pattern": "/guides/([a-z0-9-]+)",
"resolveTargetsFrom": "content",
"targetExtensions": [".mdx"],
"extraValidSlugs": []
}
}

- `contentDir`: the folder to check. No glob needed; it walks the tree.
- `requiredFrontmatter`: fields that must be present and not empty.
- `shapeContracts`: the "fields match" check. Each line says "this field must be a list of items with these keys." Add `"when"` to make it conditional, so `steps` is only required when `schema` is `HowTo`.
- `internalLink`: a pattern with one capture group for the slug, and where to find valid slugs (the filenames of your content). `extraValidSlugs` covers pages served by a redirect instead of a file.
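As a rough illustration of how the `internalLink` settings fit together, the sketch below extracts slugs with the configured pattern and compares them to the slugs derived from filenames. It is an assumption about the general approach, not the tool's source: `findBrokenLinks` is a hypothetical name, and it assumes a slug is simply a filename with its extension stripped.

```javascript
// Hypothetical sketch of the link check; the real tool may work differently.
function findBrokenLinks(body, pattern, contentFiles, targetExtensions, extraValidSlugs) {
  // Assumption: a valid slug is a content filename minus its extension.
  const validSlugs = new Set(extraValidSlugs);
  for (const file of contentFiles) {
    const ext = targetExtensions.find((e) => file.endsWith(e));
    if (ext) validSlugs.add(file.slice(0, -ext.length));
  }

  // The "g" flag lets matchAll visit every link in the body.
  const linkRegex = new RegExp(pattern, "g");
  const broken = [];
  for (const match of body.matchAll(linkRegex)) {
    const slug = match[1]; // the single capture group from the pattern
    if (!validSlugs.has(slug)) broken.push(slug);
  }
  return broken;
}

const body = "See [setup](/guides/good-guide) and [old](/guides/this-page-was-renamed).";
console.log(
  findBrokenLinks(body, "/guides/([a-z0-9-]+)", ["good-guide.mdx"], [".mdx"], [])
);
```

Adding a slug to the last argument, the way `extraValidSlugs` does in the config, would make a redirect-backed link pass even though no file exists for it.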
It checks both ends of every wire. A link is only good if the page it points to exists. A field is only good if it is shaped the way the code that reads it expects. The tool knows both ends and compares them.
You define what matters, not the tool. Your fields and shapes are yours to declare. The tool has no opinion about your content, only that your content matches the rules you wrote down.
It compiles the real MDX, it does not guess. The render check runs the actual MDX compiler, so it catches anything that would throw on a real page, not just patterns a regex anticipated.
It reports, you fix. It finds the break and tells you exactly where. Fixing is your call, because sometimes a change is intentional. It never rewrites your content.
Almost no moving parts. It walks folders with the standard library, reads frontmatter with gray-matter, and compiles with @mdx-js/mdx. That is the entire toolchain. Exit codes are built for CI: 0 clean, 1 something broke, 2 a config or internal error.
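If you wrap the tool in your own CI script, the three exit codes can be told apart along these lines. This is only a sketch: `describeExit` is a hypothetical helper, and the wording of each reason is mine; only the meaning of 0, 1, and 2 comes from the description above.

```javascript
// Hypothetical helper for a CI wrapper script.
// Only the meaning of codes 0, 1, and 2 is taken from this README.
function describeExit(status) {
  switch (status) {
    case 0:
      return { ok: true, reason: "content is clean" };
    case 1:
      return { ok: false, reason: "content has at least one break" };
    case 2:
      return { ok: false, reason: "config or internal error" };
    default:
      return { ok: false, reason: `unexpected exit code ${status}` };
  }
}

console.log(describeExit(1).reason);
```

A wrapper could pass in the `status` field returned by Node's `child_process.spawnSync`, which lets it treat a misconfigured run (2) differently from a genuine content break (1).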
- It is not a style or spelling checker. For formatting and prose, use a style linter or Prettier. This checks that things are wired correctly, not that they read well.
- It does not check that external websites are up. It only checks that your own internal links point to content you have.
- It does not auto-fix. It shows you every break; you decide the fix.
Contributions are welcome, and the most useful kind is a class of silent break this tool does not yet catch. If you have hit one, open an issue describing it or send a pull request. The fastest way to land a fix is a reproduction in examples/content/: a small broken file plus the catch you expected. Bug reports and false positives are welcome too.
