As AI agents and crawlers increasingly consume web content, developers need to optimize their sites for a "third audience" beyond humans and traditional search engines. This post demonstrates how to implement Markdown endpoints in Next.js 14+ using the App Router, complete with auto-discovery links that help AI crawlers find and consume your content more efficiently.
Inspired by Dries Buytaert's "The Third Audience" experiment, we'll build a system that serves clean Markdown versions of your content alongside the standard HTML experience.
AI crawlers can parse HTML, but they have to wade through navigation menus, wrapper divs, and other UI elements to extract the actual content. Markdown gives them that content directly, without the surrounding page markup.
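Our solution consists of three components: a catch-all route handler that serves Markdown, HTML-to-Markdown conversion with Turndown plus YAML frontmatter, and an auto-discovery link in each page's metadata.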
This tutorial assumes you have a Next.js 14+ project using the App Router and a Drupal site connected through next-drupal.
First, add the required packages:
npm install next-drupal turndown
npm install --save-dev @types/turndown
Set up your Drupal client in lib/drupal.ts:
import { DrupalClient } from "next-drupal";

export const drupal = new DrupalClient(
  process.env.NEXT_PUBLIC_DRUPAL_BASE_URL,
  {
    auth: {
      clientId: process.env.DRUPAL_CLIENT_ID,
      clientSecret: process.env.DRUPAL_CLIENT_SECRET,
    },
  }
);
Create a new catch-all route at app/md/[...slug]/route.ts:
import { drupal } from "@/lib/drupal";
import { DrupalNode } from "next-drupal";
import TurndownService from "turndown";

// Note: in Next.js 15+, `params` is a Promise and must be awaited before use.
export async function GET(
  request: Request,
  { params }: { params: { slug: string[] } }
) {
  const path = `/${params.slug.join("/")}`;

  try {
    // Fetch the resource from Drupal using the path
    const resource = await drupal.getResourceByPath<DrupalNode>(path);
    if (!resource) {
      return new Response("Not Found", { status: 404 });
    }

    // Configure the HTML-to-Markdown converter
    const converter = new TurndownService({
      headingStyle: "atx",
      codeBlockStyle: "fenced",
    });

    // Build metadata object from Drupal fields
    const metadata: Record<string, string | null> = {
      title: resource.title,
      date: resource.created,
      summary: resource.body?.summary || null,
      type: resource.type.replace("node--", ""),
      url: resource.path?.alias || path,
      id: resource.id,
    };

    // Add image if available
    if (resource.field_image?.uri?.url) {
      metadata.image = `${process.env.NEXT_PUBLIC_DRUPAL_BASE_URL}${resource.field_image.uri.url}`;
    }

    // Generate markdown with YAML frontmatter
    const markdown = [
      "---",
      ...Object.entries(metadata).map(([key, value]) => {
        if (value === null) {
          return `${key}:`;
        }
        // JSON.stringify escapes quotes, backslashes, and newlines, and its
        // output is also a valid YAML double-quoted scalar
        return `${key}: ${JSON.stringify(value)}`;
      }),
      "---",
      "",
      converter.turndown(resource.body?.processed || ""),
    ].join("\n");

    return new Response(markdown, {
      status: 200,
      headers: {
        "Content-Type": "text/markdown; charset=utf-8",
        "Cache-Control": "public, max-age=3600",
      },
    });
  } catch (error) {
    console.error("Error fetching resource:", error);
    return new Response("Internal Server Error", { status: 500 });
  }
}
Update your metadata generation to include alternate format discovery. In your page or layout file:
import { drupal } from "@/lib/drupal";
import { DrupalNode } from "next-drupal";
import type { Metadata } from "next";

export async function generateMetadata({
  params,
}: {
  params: { slug: string[] };
}): Promise<Metadata> {
  const path = `/${params.slug.join("/")}`;
  const resource = await drupal.getResourceByPath<DrupalNode>(path);
  if (!resource) {
    return {};
  }

  const baseUrl = process.env.NEXT_PUBLIC_BASE_URL || "https://example.com";

  return {
    title: resource.title,
    description: resource.body?.summary,
    alternates: {
      canonical: `${baseUrl}${resource.path?.alias || path}`,
      types: {
        "text/markdown": [
          {
            url: `${baseUrl}/md${resource.path?.alias || path}`,
            title: `${resource.title} | Markdown Version`,
          },
        ],
      },
    },
  };
}
This generates HTML like:
<link rel="alternate" type="text/markdown"
  href="https://your-nextjs-site.com/md/your-article-path"
  title="Your Article | Markdown Version" />
A page served at /your-article then has its Markdown counterpart at /md/your-article.
In Dries' experiment, AI crawlers (ClaudeBot, GPTBot, OpenAI's SearchBot) found and consumed the Markdown endpoints within hours. The auto-discovery pattern borrowed from RSS proved immediately effective.
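To make the discovery step concrete, here is a minimal sketch of how a client might locate the Markdown alternate in a page's HTML. The `findMarkdownAlternate` helper is hypothetical and is not part of Next.js or next-drupal. It uses a deliberately simple regular-expression scan that only handles double-quoted attributes, so a real crawler would more likely rely on a full HTML parser.

```typescript
// Hypothetical helper: shows how a client could follow a
// <link rel="alternate" type="text/markdown"> tag to the Markdown endpoint.
function findMarkdownAlternate(html: string, pageUrl: string): string | null {
  const linkTag = /<link\b[^>]*>/gi;
  let tag: RegExpExecArray | null;

  while ((tag = linkTag.exec(html)) !== null) {
    // Collect the double-quoted attributes of this <link> tag.
    const attrs: Record<string, string> = {};
    const attr = /([\w-]+)\s*=\s*"([^"]*)"/g;
    let match: RegExpExecArray | null;
    while ((match = attr.exec(tag[0])) !== null) {
      attrs[match[1].toLowerCase()] = match[2];
    }

    if (attrs.rel === "alternate" && attrs.type === "text/markdown" && attrs.href) {
      // Resolve relative hrefs against the URL of the page that was fetched.
      return new URL(attrs.href, pageUrl).toString();
    }
  }

  // No Markdown alternate was advertised on this page.
  return null;
}
```

A relative `href` such as `/md/your-article` is resolved against the page URL, and an absolute `href` is returned unchanged apart from URL normalization.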
const converter = new TurndownService({
  headingStyle: "atx",
  codeBlockStyle: "fenced",
  bulletListMarker: "-",
});

// Add custom rules for special markup
converter.addRule("strikethrough", {
  filter: ["del", "s"],
  replacement: (content) => `~~${content}~~`,
});

// Preserve certain HTML elements
converter.keep(["iframe"]);
export async function GET(request: Request) {
  const acceptHeader = request.headers.get("accept");
  if (acceptHeader?.includes("text/markdown")) {
    // Serve markdown
  }
  // Fallback behavior
}
const metadata: Record<string, string | null> = {
  title: resource.title,
  date: resource.created,
  author: resource.uid?.display_name || null,
  tags: resource.field_tags?.map((tag) => tag.name).join(", ") || null,
  category: resource.field_category?.name || null,
  // Add any custom fields from your content type
};
Caching: Markdown conversion can be cached aggressively when content changes infrequently. Consider edge caching with Vercel or Cloudflare, or use Next.js's built-in caching strategies.
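As one way to act on this, the sketch below builds a `Cache-Control` value that lets shared caches hold the Markdown longer than browsers do. The `markdownCacheControl` helper and its default durations are assumptions for illustration. `s-maxage` and `stale-while-revalidate` are standard `Cache-Control` directives, but CDNs differ in how they honor them, so check your provider's documentation.

```typescript
// Hypothetical helper: the name and default durations are illustrative only.
interface CacheDurations {
  browserSeconds?: number; // how long browsers may reuse the response
  edgeSeconds?: number; // how long shared caches (CDNs) may reuse it
  staleSeconds?: number; // how long a stale copy may be served while refreshing
}

function markdownCacheControl({
  browserSeconds = 0,
  edgeSeconds = 3600,
  staleSeconds = 86400,
}: CacheDurations = {}): string {
  return [
    "public",
    `max-age=${browserSeconds}`,
    `s-maxage=${edgeSeconds}`,
    `stale-while-revalidate=${staleSeconds}`,
  ].join(", ");
}
```

The returned string could be used as the `"Cache-Control"` header value in the route handler's `Response`.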
Performance: The drupal.getResourceByPath() method supports caching out of the box. For extremely high-traffic sites, consider pre-generating Markdown at build time using generateStaticParams().
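A rough sketch of the pre-generation idea: for a `[...slug]` segment, `generateStaticParams()` returns objects of the form `{ slug: string[] }`, so each path alias has to be split into its segments. The `aliasesToStaticParams` helper is hypothetical, and how you obtain the list of aliases from Drupal is left as an assumption (`fetchAllAliases` below is a placeholder, not a next-drupal function).

```typescript
type SlugParams = { slug: string[] };

// Hypothetical helper: converts Drupal path aliases such as "/blog/my-post"
// into the params shape used by a [...slug] catch-all segment.
function aliasesToStaticParams(aliases: string[]): SlugParams[] {
  return aliases
    .map((alias) => alias.split("/").filter((segment) => segment.length > 0))
    // A catch-all segment needs at least one path segment, so skip "/" and "".
    .filter((slug) => slug.length > 0)
    .map((slug) => ({ slug }));
}

// In app/md/[...slug]/route.ts this might be wired up as follows, where
// fetchAllAliases() stands in for your own Drupal query:
//
// export async function generateStaticParams() {
//   return aliasesToStaticParams(await fetchAllAliases());
// }
```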
SEO Impact: This approach maintains your existing HTML experience while adding an alternate format. It shouldn't negatively impact traditional SEO.
Content Licensing: Include licensing information in your frontmatter to clarify usage rights for AI training:
const metadata = {
  // ... other fields
  license: "CC-BY-4.0",
  // Fall back to the request path when the node has no alias
  canonical: `${baseUrl}${resource.path?.alias || path}`,
};
Add these to your .env.local:
NEXT_PUBLIC_DRUPAL_BASE_URL=https://your-drupal-site.com
DRUPAL_CLIENT_ID=your-client-id
DRUPAL_CLIENT_SECRET=your-client-secret
NEXT_PUBLIC_BASE_URL=https://your-nextjs-site.com
Serving Markdown versions of your content is a straightforward way to optimize for AI agents. Using Next.js App Router's flexible routing system and the next-drupal package, you can implement this feature with minimal code while maintaining full control over what content is exposed and how it's formatted.
As the "third audience" of AI agents becomes increasingly important, providing cleaner, structured content formats may become as essential as traditional SEO.