Move content from HTML to Markdown
You’re migrating 200 blog posts from WordPress to Hugo. Or you scraped a documentation page and want it in a readable format for your notes. Or you’ve got an HTML email and just want the clean text content. HTML is verbose: all those opening and closing tags make it hard for a human to read. Markdown gives you the same content with a fraction of the syntax.
Paste the HTML, click convert, get Markdown. <h2> becomes ##. <strong> becomes **bold**. <a href="..."> becomes [text](url). Script and style tags are stripped automatically, so you get only the content.
What gets converted
<h1> through <h6> → # through ######
<strong> and <b> → **bold**
<em> and <i> → *italic*
<a href="url">text</a> → [text](url)
<img src="url" alt="text"> → ![text](url)
<ul>/<li> → - item
<ol>/<li> → 1. item
<pre><code> → fenced code blocks with triple backticks
<code> → `inline code`
<blockquote> → > quote
<hr> → ---
Paragraphs become text separated by blank lines. Line breaks are preserved. Classes, IDs, and inline styles are dropped because Markdown doesn’t support them.
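The mappings above can be sketched in a few lines of Python. This is only an illustration built on the standard library's html.parser module, not this converter's actual implementation. The names MarkdownSketch and html_to_markdown are hypothetical, and the sketch covers only a subset of the tags: every list item is rendered as a - bullet, and whitespace between tags is passed through untouched.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Toy converter (hypothetical name) covering a few of the listed mappings."""

    INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*", "code": "`"}
    HEADINGS = {f"h{n}": "#" * n + " " for n in range(1, 7)}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []      # pieces of Markdown output, joined at the end
        self.href = None   # URL of the link currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # class, id, and style are never read, so they drop out
        if tag in self.HEADINGS:
            self.out.append(self.HEADINGS[tag])
        elif tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "a":
            self.href = attrs.get("href") or ""
            self.out.append("[")
        elif tag == "img":
            alt, src = attrs.get("alt") or "", attrs.get("src") or ""
            self.out.append(f"![{alt}]({src})")
        elif tag == "li":
            self.out.append("- ")  # simplification: ordered lists get bullets too
        elif tag == "hr":
            self.out.append("---\n\n")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "a" and self.href is not None:
            self.out.append(f"]({self.href})")
            self.href = None
        elif tag in self.HEADINGS or tag == "p":
            self.out.append("\n\n")  # blank line after headings and paragraphs
        elif tag == "li":
            self.out.append("\n")

    def handle_data(self, data):
        self.out.append(data)


def html_to_markdown(html: str) -> str:
    """Convert an HTML fragment to Markdown using the toy parser above."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


# Prints a "## Hello" heading, a blank line, then "Some **bold** text."
print(html_to_markdown('<h2 class="title">Hello</h2><p>Some <strong>bold</strong> text.</p>'))
```

Unknown tags such as <span> and <div> fall through every branch, so only their text survives. A converter meant for real pages would also need to handle nesting, ordered-list numbering, code blocks, and whitespace normalization.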
Migration scenarios
WordPress to static site generators: Jekyll, Hugo, Gatsby, and Docusaurus all use Markdown. Convert your exported HTML posts and you’re most of the way there.
Content cleanup: your text has been through three different CMSs and accumulated layers of <span> and <div> cruft. Converting to Markdown strips it all away, leaving just the content.
Documentation: you found great API docs on a website and want them in your team’s Markdown-based wiki. Convert and paste.
Archiving: Markdown is plain text. It’ll be readable in 20 years without any special software. HTML might need a browser.
For the reverse, Markdown to HTML, there’s the Markdown to HTML converter.
FAQ
Does it handle messy real-world HTML?
It handles most HTML from web pages and CMS platforms well. Extremely complex or deeply nested markup might need minor cleanup after conversion.
Script and style tags?
Automatically stripped, including their contents. You get clean content only.
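To make "including their contents" concrete, here is a small Python sketch of one way such stripping can work, again using the standard library's html.parser. It is an assumption about the general technique rather than a description of this tool's internals, and the names TextOnly and visible_text are hypothetical.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text while ignoring everything inside <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []       # text fragments that should be kept
        self.skip_depth = 0   # greater than 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Script and style bodies arrive here as data; drop them while skipping.
        if not self.skip_depth:
            self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the text of an HTML fragment without script or style contents."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Prints "Hello": the CSS rule and the JavaScript call are both discarded.
print(visible_text('<style>p{color:red}</style><p>Hello</p><script>alert("hi")</script>'))
```

Removing only the tags would leave the CSS and JavaScript source in the output as stray text, which is why the element contents are dropped along with the tags themselves.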
Classes and attributes?
Dropped. Markdown doesn’t support them, so they’re removed during conversion.
Client-side?
Yes, your HTML never leaves your browser.