Clean HTML to Markdown Converter for GitHub READMEs

Migrating documentation to GitHub? Discover the easiest way to convert messy HTML into clean, GitHub-flavored Markdown for perfectly formatted READMEs.

Key Takeaways

  • Markdown is the language of GitHub. HTML is supported but discouraged for READMEs.
  • Automated converters save hours when migrating blogs or wikis to GitHub.
  • Look for "GitHub Flavored Markdown" (GFM) support to handle tables and checklists correctly.
  • Always review the output: nested lists and complex tables often break during conversion.

You have a beautiful project documentation site built in plain HTML or a CMS, and now you want to move it to a GitHub repository's README.md or Wiki. Copy-pasting the rendered text loses the formatting. Copy-pasting the HTML source keeps it, but the result is ugly and hard to edit.

The solution is an HTML to Markdown Converter.

GitHub Standard

Markdown files on GitHub render instantly. They are lightweight, version-controllable, and readable even as raw text. HTML files in a repository, by contrast, are usually displayed as raw source code.

Challenges in HTML → Markdown Conversion

It sounds simple, but it's tricky. HTML has nested `div`s, spans with styles, and complex tables. Markdown is flat and simple.
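Before converting, it can help to measure how much of this tricky markup a page actually contains. The following is a minimal sketch using only Python's standard `html.parser` module; the `RiskScanner` class and `scan` function are hypothetical names chosen for illustration, and the three counters are just one possible way to flag trouble spots.

```python
from html.parser import HTMLParser


class RiskScanner(HTMLParser):
    """Hypothetical pre-flight check: counts constructs that convert poorly."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0       # current nesting level of <div>
        self.max_div_depth = 0   # deepest <div> nesting seen
        self.styled_spans = 0    # <span> elements carrying inline styles
        self.spanned_cells = 0   # table cells using rowspan or colspan

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if tag == "div":
            self.div_depth += 1
            self.max_div_depth = max(self.max_div_depth, self.div_depth)
        elif tag == "span" and "style" in names:
            self.styled_spans += 1
        elif tag in ("td", "th") and names & {"rowspan", "colspan"}:
            self.spanned_cells += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1


def scan(html):
    """Return a small report of conversion risks found in the HTML string."""
    scanner = RiskScanner()
    scanner.feed(html)
    scanner.close()
    return {
        "max_div_depth": scanner.max_div_depth,
        "styled_spans": scanner.styled_spans,
        "spanned_cells": scanner.spanned_cells,
    }
```

A page that reports many spanned cells or deep `div` nesting is a good candidate for manual cleanup before it goes through a converter.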

| HTML Element | Conversion Challenge | Markdown Solution |
| --- | --- | --- |
| `<a href="...">` | Easy | `[Link](url)` |
| `<table>` | Hard (rowspans not supported) | GFM Tables (pipe-delimited) |
| `<div class="alert">` | No semantic equivalent | `>` Blockquote |
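To make the link and alert rows above concrete, here is a deliberately tiny sketch of a converter built on Python's standard `html.parser`. It is not how any particular converter tool works; `MiniConverter` and `to_markdown` are hypothetical names, and the sketch assumes single-line alert text with no nested `div` inside the alert.

```python
from html.parser import HTMLParser


class MiniConverter(HTMLParser):
    """Hypothetical converter that handles only links and alert divs."""

    def __init__(self):
        super().__init__()
        self.out = []          # Markdown fragments collected so far
        self.href = None       # href of the <a> currently open, if any
        self.link_text = []    # text collected inside the open <a>
        self.in_alert = False  # True while inside <div class="alert">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.href = attrs.get("href") or ""
            self.link_text = []
        elif tag == "div" and "alert" in (attrs.get("class") or "").split():
            # No semantic equivalent in Markdown, so fall back to a blockquote.
            self.in_alert = True
            self.out.append("> ")

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            self.out.append(f"[{''.join(self.link_text)}]({self.href})")
            self.href = None
        elif tag == "div" and self.in_alert:
            self.in_alert = False
            self.out.append("\n")

    def handle_data(self, data):
        # Text inside a link is buffered; everything else passes through.
        if self.href is not None:
            self.link_text.append(data)
        else:
            self.out.append(data)


def to_markdown(html):
    """Convert a small HTML fragment to Markdown (links and alerts only)."""
    parser = MiniConverter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()
```

Tables are left out on purpose: a table that uses `rowspan` has no direct GFM equivalent, so it needs restructuring by hand before any automated step.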

Best Practices for Clean Conversion

  1. Strip Styles First: Remove inline CSS (`style="..."`) and class names. They don't translate to Markdown.
  2. Flatten Structure: Replace deeply nested divs with simple headings (`h2`, `h3`).
  3. Fix Images: Relative paths in HTML (`src="/img/logo.png"`) will break on GitHub unless the file exists in the repo. Update paths to absolute URLs or commit the images.
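The first and third practices above can be sketched as a small pre-processing pass that runs before the converter. This is an illustrative example using only the Python standard library; `Cleaner`, `clean_html`, and the `base_url` parameter are hypothetical names, and the sketch assumes you want absolute image URLs in place of committed image files. Comments and doctype declarations are dropped as a side effect.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# Attributes that carry presentation only and have no Markdown equivalent.
DROP_ATTRS = {"style", "class"}


class Cleaner(HTMLParser):
    """Hypothetical cleaner: re-emits HTML without style/class attributes
    and with image paths resolved against a base URL."""

    def __init__(self, base_url):
        # Keep entities such as &amp; untouched in text content.
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.out = []

    def _emit_tag(self, tag, attrs, closing=">"):
        parts = [tag]
        for name, value in attrs:
            if name in DROP_ATTRS:
                continue
            if tag == "img" and name == "src" and value:
                value = urljoin(self.base_url, value)
            parts.append(name if value is None else f'{name}="{escape(value)}"')
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, closing=" />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def clean_html(html, base_url):
    """Strip style/class attributes and make <img> sources absolute."""
    cleaner = Cleaner(base_url)
    cleaner.feed(html)
    cleaner.close()
    return "".join(cleaner.out)
```

For example, calling `clean_html(page, "https://example.com/docs/")` on the original site's HTML would resolve `/img/logo.png` against that placeholder host. If you commit the images to the repository instead, skip the URL rewrite and adjust the paths to match the repo layout.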

Frequently Asked Questions

Does GitHub support all HTML tags?

GitHub sanitizes HTML in Markdown files. Only basic tags like `div`, `span`, `img`, and `table` are allowed. `