Displaying HTML-formatted documents without HTML

Most of the documents displayed on my genealogy website conform to one of a limited number of formats, and I’ve created an HTML template for each one. However, I also have some free-form text documents that I wanted to display with as much of the original formatting intact as possible. I couldn’t use HTML tags for the formatting, though, because I can’t store HTML in the database behind my site: the interface flags the input as potentially dangerous code. The solution has been to store the data in a plain-text format that my application can then convert to HTML.

In the past I’ve worked with RedCloth and BlueCloth, the Ruby implementations of Textile and Markdown respectively, on Ruby on Rails sites. Markdown in its original incarnation is a Perl tool that converts plain text marked up in its own syntax into HTML. Textile does much the same thing, but was originally implemented in PHP. Both have C# implementations, so I chose between them on ease of use. Textile has some useful tags that Markdown lacks (strikethrough, for example), but I find Markdown’s syntax more attractive in its unconverted state: [link_text](link_address "title") in Markdown, as opposed to "(classname)link text(title tooltip)":link_address in Textile.
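To make the link syntax concrete, here is a rough sketch of how a Markdown inline link maps onto an anchor tag. It is not how any real Markdown converter is implemented, and the function names markdownLinkToHtml and escapeAttr are made up for this illustration.

```javascript
// Minimal sketch only: converts [text](address "optional title") to <a> tags.
// Both function names are hypothetical; a real converter handles many more cases.
function escapeAttr(value) {
    return value
        .replace(/&/g, "&amp;")
        .replace(/"/g, "&quot;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

function markdownLinkToHtml(text) {
    // Group 1: link text, group 2: address, group 3: optional quoted title
    var pattern = /\[([^\]]+)\]\((\S+?)(?:\s+"([^"]*)")?\)/g;
    return text.replace(pattern, function (match, label, href, title) {
        var titleAttr = title ? ' title="' + escapeAttr(title) + '"' : "";
        return '<a href="' + escapeAttr(href) + '"' + titleAttr + ">" + label + "</a>";
    });
}

console.log(markdownLinkToHtml('[Daring Fireball](http://daringfireball.net "Markdown home")'));
```

The title is optional, so [text](address) on its own produces an anchor without a title attribute.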

Markdown is used on the Stack Overflow site, and they have open-sourced their C# implementation as MarkdownSharp. Marked-up text can be converted to HTML using:

// inputText holds the marked-up plain text
var markdown = new Markdown();
var convertedText = markdown.Transform(inputText);

Also useful for my purposes is the JavaScript port of Markdown by John Fraser, Showdown. The download link on this site is currently broken, but there’s a version on GitHub here. A couple of lines of script, and I can get a rough idea of what my converted content will look like before saving the marked-up text:

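        function preview() {
            var converter = new Showdown.converter();
            document.getElementById("preview").innerHTML = converter.makeHtml(document.getElementById("input").value); // "input" and "preview" are placeholder element ids
        }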

So now I can display my great-great-grandfather’s letter with close to original formatting as well as the original rather quirky spelling…


Tag           Markdown
<br />        Two spaces at the end of the line
<em>          *asterisks* or _underscores_
<strong>      **double asterisks** or __double underscores__
<a>           [Text](href "title")
<h1>          heading 1 underlined with a row of = signs, or # heading 1
<h2>          heading 2 underlined with a row of - signs, or ## heading 2
<hr/>         ***
<ul>          * asterisk
<ol>          1. number followed by a period then a space
<blockquote>  > angled bracket
<img>         ![alt text](/path/to/img.jpg "Title")
<code>        `backticks`
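A few of the rules listed above are simple enough to mimic with regular expressions. The sketch below is only meant to show how the markup maps onto tags: it is not Showdown’s or MarkdownSharp’s implementation, the function names convertInline and convertLine are invented for the example, and it ignores edge cases such as underscores inside words.

```javascript
// Illustrative only: handles `code`, **strong**/__strong__, *em*/_em_,
// # headings and *** rules. Function names are hypothetical.
function convertInline(text) {
    return text
        .replace(/`([^`]+)`/g, "<code>$1</code>")
        // Double markers must be handled before single ones
        .replace(/(\*\*|__)(.+?)\1/g, "<strong>$2</strong>")
        .replace(/(\*|_)(.+?)\1/g, "<em>$2</em>");
}

function convertLine(line) {
    // One to six # characters followed by a space start a heading
    var heading = /^(#{1,6})\s+(.*)$/.exec(line);
    if (heading) {
        var level = heading[1].length;
        return "<h" + level + ">" + convertInline(heading[2]) + "</h" + level + ">";
    }
    // Three or more asterisks on their own line make a horizontal rule
    if (/^\*{3,}\s*$/.test(line)) {
        return "<hr />";
    }
    return convertInline(line);
}

console.log(convertLine("## A *quirky* letter"));
```

For anything beyond a quick experiment, a full converter such as Showdown is the better choice, since it also deals with lists, blockquotes, nesting and escaping.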

Daring Fireball Markdown: basics
Stack Overflow Markdown Reference


About Jennifer Phillips Campbell

Software Developer and Medieval Historian
This entry was posted in JavaScript.