Markdown to HTML Using Python

This page shows how to use Python to convert a Markdown file to an HTML file.

Almost all of my web pages are written in Markdown. Even this one. But I hate to start up some programming IDE or upload my Markdown file to some random website just to convert it to HTML. So, I wrote a couple of simple Python scripts.

These scripts use Python 3 and the Python-Markdown package, which you will need to install. On my Debian Linux machine it is as simple as:

sudo apt install python3-markdown

For the second script on this page, you will also need glob. It is part of the Python standard library, so there is nothing more to install.

Simple Python Script

I will break it into parts to explain, and then I will put it all together in the end.

Set up your Python script with the usual stuff:

#!/usr/bin/env python3

import markdown

Then set the input and output files with variables. Yes, you could ask for the file with input(), but this is supposed to be simple. After that, read the file and pass its contents to markdown.markdown() (lowercase) to convert:

infile = 'directory/example.md'
outfile = 'directory/example.html'

with open(infile, 'r') as f:
    html = markdown.markdown(f.read())

This converts only the content of your Markdown file to HTML. The result is missing the opening and closing tags that a complete HTML file needs, so put that text into string variables. Python joins adjacent string literals inside the parentheses automatically, without the + operator, and each \n adds a line break to the final file:

top = (
    "<!DOCTYPE html>\n"
    "<html>\n"
    "\n"
    "<body>\n"
    "\n"
)

bottom = (
    "\n"
    "\n"
    "</body>\n"
    "\n"
    "</html>\n"
)
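If the parenthesized strings look odd, here is a small check that they really do produce one ordinary string. It is only an illustration: the variable names implicit, explicit, and joined are made up for this comparison and are not part of the script.

```python
# Adjacent string literals inside parentheses are merged by Python itself.
implicit = (
    "<!DOCTYPE html>\n"
    "<html>\n"
)

# The same text built with the + operator.
explicit = "<!DOCTYPE html>\n" + "<html>\n"

# The same text again, built from a list of lines.
joined = "\n".join(["<!DOCTYPE html>", "<html>"]) + "\n"

print(implicit == explicit == joined)
```

All three spellings give the same string, so which one to use is mostly a matter of taste. The parenthesized form just keeps each line of HTML visible on its own line of Python.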

And then simply concatenate the parts and write it all to the output file:

whole = top + html + bottom

print(outfile)
with open(outfile, 'w') as f:
    f.write(whole)

And now the whole thing put together for your cut-paste convenience:

#!/usr/bin/env python3

import markdown

infile = 'directory/example.md'
outfile = 'directory/example.html'

with open(infile, 'r') as f:
    html = markdown.markdown(f.read())

top = (
    "<!DOCTYPE html>\n"
    "<html>\n"
    "\n"
    "<body>\n"
    "\n"
)

bottom = (
    "\n"
    "\n"
    "</body>\n"
    "\n"
    "</html>\n"
)

whole = top + html + bottom

print(outfile)
with open(outfile, 'w') as f:
    f.write(whole)
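If editing the two path variables for every file gets tedious, one possible variation is to take the input file from the command line and derive the output name from it. This is only a sketch under my own assumptions: the function name pick_files and the usage message are invented for illustration, and the example passes a hand-written argument list where a real script would pass sys.argv.

```python
from pathlib import Path


def pick_files(argv):
    """Return (infile, outfile) from a sys.argv-style list.

    The output path is optional. When it is missing, the input path is
    reused with its extension swapped for .html.
    """
    if len(argv) < 2:
        raise SystemExit('usage: script.py INFILE.md [OUTFILE.html]')
    infile = Path(argv[1])
    outfile = Path(argv[2]) if len(argv) > 2 else infile.with_suffix('.html')
    return infile, outfile


# Hand-written argument list for demonstration; use sys.argv in a script.
infile, outfile = pick_files(['script.py', 'directory/example.md'])
print(infile, '->', outfile)
```

The two results can be passed to open() exactly like the string paths in the script above.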

Simple Python Script for a Whole Website

If you have a whole bunch of markdown files, all of which need to go into a bunch of subdirectories, this might be your script. First the setup...

I like to keep my Markdown in subdirectories that match the HTML subdirectory structure. Here's an example:

working_directory
    |
    |-- this_markdown_script.py
    |
    |-- markdown
    |   |-- index.md
    |   |-- contact.md
    |   '-- projects
    |       |-- project1.md
    |       '-- project2.md
    |
    '-- website
        |-- index.html
        |-- contact.html
        '-- projects
            |-- project1.html
            '-- project2.html

So what this script will do is walk through the markdown directory, convert every .md file it finds, and place the results in the website directory. The matching subdirectories under website must already exist, because the script does not create them. The goal is to make updating the website easier: you can just copy the website files to the actual website directory. This script also extracts metadata and builds a more complete HTML file.

Again, start with the usual Python3 stuff:

#!/usr/bin/env python3

import markdown
import glob

The curveball here is to create an instance of the markdown.Markdown (uppercase) class. And, we will include an extension to acquire the metadata in the markdown file:

md = markdown.Markdown(extensions=['meta'])

Actually, let's add another extension... The base Markdown class converts using the fewest HTML tags it can. For additional features, you add extensions. The toc extension adds id attributes to headers and can build a Table of Contents wherever you place a [TOC] marker in the Markdown file, so we will include it as well:

md = markdown.Markdown(extensions=['meta', 'toc'])

Then use glob to get a list of the files recursively in the directory structure:

files = glob.glob('markdown/**/*.md', recursive=True)
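To see what the ** pattern does with and without the recursive flag, here is a small self-contained experiment. It builds a throwaway directory tree in a temporary folder (the file names simply mirror the example layout above) and runs the same pattern both ways. The variable names found and flat are made up for this demonstration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny tree: markdown/index.md and markdown/projects/project1.md
    os.makedirs(os.path.join(root, 'markdown', 'projects'))
    for rel in ('index.md', os.path.join('projects', 'project1.md')):
        path = os.path.join(root, 'markdown', rel)
        with open(path, 'w', encoding='utf-8') as f:
            f.write('# placeholder\n')

    pattern = os.path.join(root, 'markdown', '**', '*.md')

    # With recursive=True, ** matches zero or more directory levels.
    found = sorted(os.path.relpath(p, root)
                   for p in glob.glob(pattern, recursive=True))

    # Without it, ** behaves like a single *, so exactly one level.
    flat = sorted(os.path.relpath(p, root) for p in glob.glob(pattern))

print(found)
print(flat)
```

The recursive call finds both files, while the plain call misses index.md because it insists on one directory between markdown and the file. The results are sorted here because glob makes no promise about ordering.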

Iterate through the files. For each file, set the fileName variable by stripping the nine characters of markdown/ from the front of the string with [9:] and the .md extension from the end with [:-3]. Then, using the Markdown instance, clear it with reset() (in case data from the previous file is still in there) and call convert() to convert it:

for file in files :

    fileName = file[9:][:-3]
    outfile = 'website/' + fileName + '.html'

    with open(file, 'r') as f:
        html = md.reset().convert(f.read())
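The slicing works because markdown/ is exactly nine characters long, so it has to change if the directory is ever renamed. A possible alternative, sketched here with pathlib from the standard library, computes the same output path without counting characters. The function name output_path and its src_root and dst_root parameters are made up for illustration.

```python
from pathlib import Path


def output_path(md_file, src_root='markdown', dst_root='website'):
    """Map markdown/<sub>/<name>.md to website/<sub>/<name>.html."""
    # Drop the source directory from the front of the path.
    relative = Path(md_file).relative_to(src_root)
    # Swap the extension and put the result under the destination directory.
    return Path(dst_root) / relative.with_suffix('.html')


print(output_path('markdown/index.md'))
print(output_path('markdown/projects/project1.md'))
```

The returned Path object can be handed straight to open(), so it could stand in for the outfile string in the loop.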

Extracting metadata is now an option because the Markdown instance was created with extensions=['meta']. After convert(), md.Meta holds a Python dictionary in which each key is the lowercased metadata name and each value is a list (since a value can span more than one line in the Markdown file).

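To store the value in a string, iterate through the list. Note that this assumes every file defines both title and date; a file missing either one raises a KeyError: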

    meta = md.Meta

    metaTitle = ''
    metaDate = ''
    for data in meta['title'] :
        metaTitle += data
    for data in meta['date'] :
        metaDate += data
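If some of your files might lack a title or a date, a small helper can supply a fallback instead of stopping with a KeyError. This is a sketch, not part of the script above: the name meta_value is made up, the example dictionary is hand-written to imitate the shape of md.Meta with placeholder values, and unlike the loops above it joins multi-line values with a space rather than running them together.

```python
def meta_value(meta, key, default=''):
    """Return the metadata for key as one string, or default if missing."""
    lines = meta.get(key, [])
    return ' '.join(lines) if lines else default


# Hand-written stand-in for md.Meta: lowercase keys, list values.
example_meta = {
    'title': ['Markdown to HTML', 'Using Python'],
    'date': ['2024-01-15'],
}

title = meta_value(example_meta, 'title')
date = meta_value(example_meta, 'date')
author = meta_value(example_meta, 'author', default='unknown')

print(title)
print(date)
print(author)
```

In the loop, calls like meta_value(md.Meta, 'title') could stand in for the two for loops.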

And again we set up the other parts of the HTML file, but this time note that the metadata is included:

    top = (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "\n"
        "<head>\n"
        "   <meta name=\"title\" content=\"" + metaTitle + "\">\n"
        "   <meta name=\"date\" content=\"" + metaDate + "\">\n"
        "   <link rel='stylesheet' type='text/css' href='/mystylesheet.css'>\n"
        "</head>\n"
        "\n"
        "<body>\n"
        "   <div class=page>\n"
        "       <div class='header'>\n"
        "       </div>\n"
        "       <div class='main'>\n"
        "           <div style='width: 66vw;'>\n"
        "               <div>\n"
        "\n"
    )

    bottom = (
        "\n"
        "\n"
        "               </div>\n"
        "           </div>\n"
        "       </div>\n"
        "       <div class='footer'>\n"
        "           <a  href='/index.html'><button class='button'>Home</button></a>\n"
        "       </div>\n"
        "   </div>\n"
        "\n"
        "</body>\n"
        "\n"
        "</html>\n"
    )
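One thing to watch with the meta tags: a title containing a double quote or an ampersand would break the content attribute. A possible safeguard is to escape the values first with html.escape from the standard library. In this sketch the function name meta_tag is made up, and escape is imported by name because the script already uses html as a variable.

```python
from html import escape


def meta_tag(name, content):
    """Build a <meta> tag with the attribute values escaped for HTML."""
    return '<meta name="{}" content="{}">'.format(escape(name), escape(content))


tag = meta_tag('title', 'Nuts & "Bolts"')
print(tag)
```

The ampersand and the quotes come out as &amp;amp; and &amp;quot;, so the browser reads the attribute as the original text.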

Again, put it together and then write it all to a file:

    whole = top + html + bottom

    print(outfile)
    with open(outfile, 'w') as f:
        f.write(whole)
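Opening a file with 'w' creates the file but not any missing directories, so the write fails if a subdirectory such as website/projects does not exist yet. A possible way around that, sketched below, is to create the directory just before writing. The function name write_page is made up, and the demonstration writes into a temporary folder so it does not touch a real website directory.

```python
import os
import tempfile


def write_page(outfile, text):
    """Write text to outfile, creating parent directories if needed."""
    folder = os.path.dirname(outfile)
    if folder:
        # exist_ok=True makes this a no-op when the directory is already there.
        os.makedirs(folder, exist_ok=True)
    with open(outfile, 'w', encoding='utf-8') as f:
        f.write(text)


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, 'website', 'projects', 'project1.html')
    write_page(target, '<p>hello</p>\n')
    with open(target, encoding='utf-8') as f:
        written = f.read()

print(written)
```

The explicit encoding='utf-8' is a choice made for this sketch so that the output does not depend on the system default.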

And now the whole thing for your cut-paste convenience:

#!/usr/bin/env python3

import markdown
import glob

md = markdown.Markdown(extensions=['meta', 'toc'])

files = glob.glob('markdown/**/*.md', recursive=True)
for file in files :

    fileName = file[9:][:-3]
    outfile = 'website/' + fileName + '.html'

    with open(file, 'r') as f:
        html = md.reset().convert(f.read())
    meta = md.Meta

    metaTitle = ''
    metaDate = ''
    for data in meta['title'] :
        metaTitle += data
    for data in meta['date'] :
        metaDate += data

    top = (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "\n"
        "<head>\n"
        "   <meta name=\"title\" content=\"" + metaTitle + "\">\n"
        "   <meta name=\"date\" content=\"" + metaDate + "\">\n"
        "   <link rel='stylesheet' type='text/css' href='/mystylesheet.css'>\n"
        "</head>\n"
        "\n"
        "<body>\n"
        "   <div class=page>\n"
        "       <div class='header'>\n"
        "       </div>\n"
        "       <div class='main'>\n"
        "           <div style='width: 66vw;'>\n"
        "               <div>\n"
        "\n"
    )

    bottom = (
        "\n"
        "\n"
        "               </div>\n"
        "           </div>\n"
        "       </div>\n"
        "       <div class='footer'>\n"
        "           <a  href='/index.html'><button class='button'>Home</button></a>\n"
        "       </div>\n"
        "   </div>\n"
        "\n"
        "</body>\n"
        "\n"
        "</html>\n"
    )

    whole = top + html + bottom

    print(outfile)
    with open(outfile, 'w') as f:
        f.write(whole)

End




Copyright, provided by, and under the protection of Worktable CNC, LLC.
Details and Terms of Use may be found here: worktablecnc.us/legal