The Complete Guide to Markdown

Markdown is the lingua franca of developer documentation. This guide covers where it came from, why there are so many flavors, what happens when you convert it to HTML, and the security implications of rendering it.


Markdown's origin

John Gruber created Markdown in 2004, in collaboration with Aaron Swartz, with a simple thesis: plain text should be readable as-is and convert cleanly to valid HTML. The syntax wasn't invented from scratch; it was modeled on how people already formatted plain-text email: asterisks for emphasis, dashes for lists, blank lines for paragraphs. If you'd ever written a text-only email, you already knew most of Markdown.

Gruber published a Perl script that handled the conversion, plus a syntax description that served as an informal spec. It was intentionally loose — covering the common cases and leaving edge cases undefined. That decision would cause problems later.


CommonMark: the spec that brought order

The original Markdown description was ambiguous. What happens when you nest a blockquote inside a list? When a paragraph and a code block share the same indentation level? Different parsers gave different answers. The same Markdown could produce different HTML depending on which tool processed it.

CommonMark launched in 2014 to fix this. It's a formal specification with more than 600 embedded examples that double as conformance tests, pinning down the behavior the original description left open. Most modern parsers, including marked, markdown-it, and remark, follow CommonMark or a close superset of it.

GitHub Flavored Markdown (GFM) extends CommonMark with features GitHub needed:

| Column A | Column B |
| -------- | -------- |
| Tables   | work     |

- [x] Task lists
- [ ] with checkboxes

~~Strikethrough~~ and autolinked URLs: https://example.com

If you're writing for GitHub, you're writing GFM whether you know it or not.


Markdown flavors

Not all Markdown is the same. Knowing which flavor your platform expects saves debugging time.

| Flavor | Notable additions | Used by |
| ------ | ----------------- | ------- |
| CommonMark | Base spec, strict parsing rules | Most modern parsers |
| GFM | Tables, task lists, strikethrough, autolinks | GitHub, GitLab |
| MDX | JSX components inline: <Chart data={sales} /> | Docusaurus, Next.js, Astro |
| MultiMarkdown | Footnotes, citations, metadata | Academic writing |
| Pandoc's Markdown | Citations, figure captions, TeX math | Academic papers, eBooks |

A task-list item such as "- [x] Done" renders as a checkbox on GitHub, but a basic CommonMark parser treats it as an ordinary list item and shows the "[x]" as literal text. Always check what your target platform supports.
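The gap between the two behaviors can be sketched in a few lines of Python. This is a toy illustration, not how GitHub or any real parser implements task lists: the function name render_list_item, the gfm flag, and the exact HTML attributes are assumptions made for the example.

```python
import html
import re

# Matches "[ ] label" or "[x] label" at the start of a list item's text.
TASK_ITEM = re.compile(r"^\[( |x|X)\]\s+(.*)$")


def render_list_item(item_text: str, gfm: bool) -> str:
    """Render the text that follows the "- " marker as an <li> element."""
    match = TASK_ITEM.match(item_text) if gfm else None
    if match is None:
        # Base CommonMark: "[x]" has no special meaning, so it stays as text.
        return f"<li>{html.escape(item_text)}</li>"
    checked = " checked" if match.group(1) in "xX" else ""
    label = html.escape(match.group(2))
    return f'<li><input type="checkbox" disabled{checked}> {label}</li>'


print(render_list_item("[x] Done", gfm=True))
print(render_list_item("[x] Done", gfm=False))
```

With gfm=True the item becomes a disabled, checked checkbox; with gfm=False the brackets survive as plain text inside the list item.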


Markdown to HTML: what's actually happening

When you convert Markdown to HTML, three things happen under the hood:

  1. Tokenization — the parser scans the input and identifies block-level elements (headings, paragraphs, lists, code blocks) and inline elements (bold, italic, links, code spans)
  2. AST construction — tokens are organized into an abstract syntax tree that represents the document structure
  3. Serialization — the AST is walked and each node is rendered as its HTML equivalent
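The three stages above can be sketched as a toy converter in Python. It is a deliberately tiny illustration under stated assumptions: it only understands ATX headings, paragraphs, and single-asterisk emphasis, it escapes raw HTML instead of passing it through, and the names Node, tokenize, build_ast, and render are hypothetical. It does not reflect how marked or any other real parser is implemented.

```python
import html
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the toy AST: a type, optional text, and child nodes."""
    type: str
    text: str = ""
    children: list = field(default_factory=list)


def tokenize(source: str) -> list[tuple[str, str]]:
    """Stage 1: split the input into (kind, text) block tokens."""
    tokens = []
    for block in re.split(r"\n\s*\n", source.strip()):
        heading = re.match(r"(#{1,6})\s+(.*)", block)
        if heading:
            tokens.append((f"h{len(heading.group(1))}", heading.group(2).strip()))
        else:
            # Join the soft-wrapped lines of a paragraph with single spaces.
            tokens.append(("p", " ".join(line.strip() for line in block.splitlines())))
    return tokens


def parse_inline(text: str) -> list[Node]:
    """Split text into plain-text and emphasis nodes (*single asterisks* only)."""
    nodes = []
    # re.split with one capture group alternates: outside, inside, outside, ...
    for i, part in enumerate(re.split(r"\*([^*]+)\*", text)):
        if part:
            nodes.append(Node("em" if i % 2 else "text", text=part))
    return nodes


def build_ast(tokens: list[tuple[str, str]]) -> Node:
    """Stage 2: arrange the tokens into a tree rooted at a document node."""
    root = Node("document")
    for kind, text in tokens:
        root.children.append(Node(kind, children=parse_inline(text)))
    return root


def render(node: Node) -> str:
    """Stage 3: walk the tree and emit HTML for each node."""
    if node.type == "text":
        return html.escape(node.text)
    if node.type == "em":
        return f"<em>{html.escape(node.text)}</em>"
    inner = "".join(render(child) for child in node.children)
    if node.type == "document":
        return inner
    return f"<{node.type}>{inner}</{node.type}>\n"


print(render(build_ast(tokenize("# Title\n\nSome *text*\nhere."))))
```

Keeping the tree as a separate step is what lets tools such as remark transform a document (rewrite links, collect headings) before any HTML is produced.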

This tool uses marked, a fast CommonMark-compatible parser written in JavaScript. Other popular parsers include markdown-it (plugin-based, extensible), remark (part of the unified ecosystem, works with ASTs), and micromark (small, spec-compliant).


Where Markdown is used

Markdown shows up everywhere in a developer's workflow. If you write code, you write Markdown. It's unavoidable.


Markdown security

Raw Markdown can contain arbitrary HTML. That's a feature: Gruber's original syntax description explicitly allows it. But if you render user-submitted Markdown without sanitization, you open yourself up to cross-site scripting (XSS):

Click here: <img src=x onerror="alert(document.cookie)">

A naive Markdown-to-HTML pipeline passes that straight through. The image at "x" fails to load, so the browser fires the onerror handler and runs the attacker's script in your page. That is a vulnerability.

Always sanitize the rendered HTML when the Markdown source is untrusted. Use DOMPurify on the client or sanitize-html on the server. Major platforms such as GitHub and GitLab strip <script>, <style>, event handlers, and other dangerous attributes before rendering.
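To make the idea of allowlist sanitization concrete, here is a minimal sketch built on Python's standard html.parser module. It is for illustration only and is not a substitute for a maintained sanitizer such as DOMPurify or sanitize-html: the allowlists, the URL prefix check, and the names Sanitizer and sanitize are all assumptions chosen for the example, and real sanitizers handle many more edge cases.

```python
import html
from html.parser import HTMLParser

# Hypothetical allowlists for this example; real policies are more detailed.
ALLOWED_TAGS = {"a", "b", "blockquote", "br", "code", "em", "h1", "h2", "h3",
                "i", "img", "li", "ol", "p", "pre", "strong", "ul"}
ALLOWED_ATTRS = {"a": {"href", "title"}, "img": {"src", "alt", "title"}}
VOID_TAGS = {"br", "img"}          # elements that never get a closing tag
DROP_CONTENT = {"script", "style"}  # drop these tags and everything inside
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:", "/", "#")


class Sanitizer(HTMLParser):
    """Rebuild HTML from parser events, keeping only allowlisted markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops onerror, onclick, style, and anything unlisted
            value = (value or "").strip()
            if name in ("href", "src") and not value.lower().startswith(SAFE_URL_PREFIXES):
                continue  # drops javascript: URLs and other unlisted schemes
            kept.append(f' {name}="{html.escape(value)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(html.escape(data))  # text is always re-escaped


def sanitize(rendered_html: str) -> str:
    """Return rendered_html with everything outside the allowlists removed."""
    parser = Sanitizer()
    parser.feed(rendered_html)
    parser.close()
    return "".join(parser.out)


print(sanitize('Click here: <img src=x onerror="alert(document.cookie)">'))
```

The design choice worth noting is the direction of the filter: the sketch keeps only what it recognizes as safe instead of trying to enumerate everything dangerous, so an unfamiliar tag or attribute is dropped by default.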


Troubleshooting

Line breaks aren't rendering: Markdown treats a single newline as a space, not a <br>. To force a line break, either add two trailing spaces at the end of the line or use an explicit <br> tag. If you actually want a new paragraph, leave a blank line between the two blocks of text.
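The soft-break versus hard-break rule can be sketched as follows. This is a simplified Python illustration that handles only the two-trailing-spaces form within a single paragraph; the function name render_paragraph is hypothetical, and real parsers cover more cases.

```python
import html


def render_paragraph(text: str) -> str:
    """Render one paragraph, distinguishing soft and hard line breaks."""
    lines = text.split("\n")
    parts = []
    for i, line in enumerate(lines):
        is_last = i == len(lines) - 1
        # Two or more trailing spaces before a newline request a hard break.
        hard_break = not is_last and line.endswith("  ")
        parts.append(html.escape(line.strip(" ")))
        if not is_last:
            # A soft break stays a plain newline, which browsers show as a space.
            parts.append("<br />\n" if hard_break else "\n")
    return "<p>" + "".join(parts) + "</p>"


print(render_paragraph("line one\nline two"))
print(render_paragraph("line one  \nline two"))
```

Only the second call produces a <br /> element; the first keeps both lines in one paragraph separated by whitespace.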

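Nested lists aren't indenting correctly: Under CommonMark, a nested item must be indented at least as far as the start of its parent item's text, which means 2 spaces under "- " and 3 under "1. ". Four spaces works in most parsers. Whichever you choose, be consistent. Also make sure there's a blank line before the first list item, since some parsers require it for proper list detection.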

HTML in Markdown isn't appearing: Most parsers pass raw HTML through, but platforms like GitHub sanitize the result. Tags like <script>, <style>, and <iframe>, along with event handler attributes (onclick, onerror), are stripped. If your HTML disappears, the platform is filtering it for security.

Tables aren't rendering: Tables are a GFM extension, not part of base CommonMark, and they require a separator row (| --- | --- |) directly under the header row with the same number of cells. If your parser only supports CommonMark, tables will render as plain text. Check that your parser or platform supports GFM.
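When debugging a table that refuses to render, it can help to check the first two lines mechanically. The sketch below is a rough Python approximation of the separator-row rule, assuming simple rows without escaped pipes; the names split_row and looks_like_table are hypothetical, and a real GFM parser applies additional rules.

```python
import re

# One separator cell: dashes with optional alignment colons, e.g. ---, :--, :-:
SEPARATOR_CELL = re.compile(r"^:?-+:?$")


def split_row(line: str) -> list[str]:
    """Split a pipe-table row into trimmed cells, ignoring the outer pipes."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def looks_like_table(header: str, separator: str) -> bool:
    """Check that the second line is a separator row matching the header."""
    if "|" not in header or "|" not in separator:
        return False
    header_cells = split_row(header)
    separator_cells = split_row(separator)
    return len(header_cells) == len(separator_cells) and all(
        SEPARATOR_CELL.match(cell) for cell in separator_cells
    )


print(looks_like_table("| Column A | Column B |", "| -------- | -------- |"))
print(looks_like_table("| Column A | Column B |", "| Tables   | work     |"))
```

The first call reports a valid header and separator pair; the second fails because the line under the header contains text instead of dashes.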