Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.

Using Flexmark 0.32.24 To Parse Markdown Content Into HTML Output In ColdFusion

By Ben Nadel on
Tags: ColdFusion

Yesterday, I took a look at using the OWASP (Open Web Application Security Project) AntiSamy project in ColdFusion in order to help sanitize and validate untrusted, user-provided HTML content. I'm particularly interested in AntiSamy because I want to enable markdown in my blog comments. However, since markdown allows arbitrary embedded HTML code, I needed to make sure that I would have a way to prevent XSS (Cross-Site Scripting) attacks. And now that I know AntiSamy can protect me, I want to look at how to actually enable markdown. For this, I'll be using the Flexmark Java library, which implements the CommonMark specification.

View this code in my Flexmark 0.32.24 With ColdFusion project on GitHub.

Flexmark appears to be a very extensible Markdown implementation. Out of the box, the API surface area is very simple. However, you can start to add parsing and rendering extensions that enable additional markdown "flavors" and a variety of formatting add-ons. For example, I want to allow for plain-text URLs to be automagically converted into HTML Anchor tags. The Flexmark core library doesn't support this. But, it does provide an "Autolinking" extension that does. The extensions are all provided as additional JAR files, making the entire Flexmark project very modular.

To load the Flexmark JAR files in our ColdFusion application, I'm going to use Mark Mandel's JavaLoader project - the gift that just keeps on giving to the ColdFusion community. For the purposes of this demo, I'm going to be loading the core Flexmark files plus the extension (and its dependencies) for the Autolinking feature. I'll be creating and caching the JavaLoader instances during the bootstrapping of our ColdFusion application file:

component
	output = false
	hint = "I provide the application settings and event handlers."
	{

	// Define the application.
	this.name = hash( getCurrentTemplatePath() );
	this.applicationTimeout = createTimeSpan( 0, 0, 10, 0 );
	this.sessionManagement = false;

	// Setup the application mappings.
	this.directory = getDirectoryFromPath( getCurrentTemplatePath() );
	this.mappings[ "/" ] = this.directory;
	this.mappings[ "/flexmark" ] = ( this.directory & "vendor/flexmark-0.32.24/" );
	this.mappings[ "/javaloader" ] = ( this.directory & "vendor/javaloader-1.2/javaloader/" );
	this.mappings[ "/javaloaderfactory" ] = ( this.directory & "vendor/javaloaderfactory/" );

	// ---
	// PUBLIC METHODS.
	// ---

	/**
	* I initialize the application.
	*
	* @output false
	*/
	public boolean function onApplicationStart() {

		// In order to prevent memory leaks, we're going to use the JavaLoaderFactory to
		// instantiate our JavaLoader. This will keep the instance cached in the Server
		// scope so that it doesn't have to continually re-create it as we test our
		// application configuration.
		application.javaLoaderFactory = new javaloaderfactory.JavaLoaderFactory();

		// Create a JavaLoader that can access the Flexmark 0.32.24 JAR files.
		// --
		// NOTE: This list of JAR files contains the CORE Flexmark functionality plus
		// the Autolink extension. Flexmark is configured such that each extension is
		// packaged as a separate, optional set of JAR files.
		application.flexmarkJavaLoader = application.javaLoaderFactory.getJavaLoader([
			expandPath( "/flexmark/autolink-0.6.0.jar" ),
			expandPath( "/flexmark/flexmark-0.32.24.jar" ),
			expandPath( "/flexmark/flexmark-ext-autolink-0.32.24.jar" ),
			expandPath( "/flexmark/flexmark-formatter-0.32.24.jar" ),
			expandPath( "/flexmark/flexmark-util-0.32.24.jar" )
		]);

		// Indicate that the application has been initialized successfully.
		return( true );

	}

}

As you can see, I'm just pointing the JavaLoader to the list of relevant JAR files. I downloaded these JAR files from the Flexmark Maven project page. Again, the Flexmark project is very modular; so, I only downloaded the JAR files needed to implement the core functionality plus the autolinking.

Once we have our JavaLoader instance created and cached, parsing markdown is fairly straightforward. The process involves a Parser, which takes the markdown content and generates an Abstract Syntax Tree (AST). That AST is then passed to a Renderer, which flattens the node tree down into an HTML string. Both the Parser and the Renderer are built using an Options map, which is how we enable features like the Autolinking extension.

To see this in action, I'm taking some buffered markdown content, parsing it, and outputting the result to the page twice: once as live HTML and once as encoded markup.

<!---
	Setup our markdown content.
	--
	NOTE: Indentation is meaningful in Markdown. As such, the fact that our content is
	indented by one tab inside of the CFSaveContent buffer is problematic. But, it makes
	the demo code easier to read. As such, we'll be stripping out the extra tab after the
	content buffer has been defined.
--->
<cfsavecontent variable="markdown">

	# About Me

	My name is Ben. I really love to work in the world of web development. Being able to
	turn thoughts into user experiences is just ... well, it's magical. You should check
	out my blog: **www.bennadel.com**.

	I like to write about the following things:

	* Angular
	* Angular 2+
	* AngularJS
	* NodeJS
	* ColdFusion
	* SQL
	* ReactJS
	* CSS

	After people read my stuff, they can often be heard to say &mdash; and I quote:

	> Ben who?

	Probably, they are super impressed with my use of white-space in code:

	```js
	function convert( thing ) {

		var mapped = thing.map(
			( item ) => {

				return( item.name );

			}
		);

		return( mapped );

	}
	```

	_**NOTE**: I am using [Prism.js](https://prismjs.com/ "Prism is very cool!") to add
	syntax highlighting._

	## The Hard Truth

	Though, it's also possible that people just come for the pictures of my dog:

	[![Lucy the Goose](./goose-duck.jpg)](./goose-duck.jpg "Click to download Goose Duck.")

	<style type="text/css">
		img { width: 250px ; }
	</style>

	How <span style="text-transform: uppercase ;">freakin' cute!</span> is that goose?!
	I am such a lucky father!

</cfsavecontent>

<!--- ------------------------------------------------------------------------------ --->
<!--- ------------------------------------------------------------------------------ --->

<cfscript>

	// As per comment above, we need to strip off one tab from each line-start.
	markdown = reReplace( markdown, "(?m)^\t", "", "all" );

	// Create some of our Class definitions. We need this in order to access some static
	// methods and properties.
	AutolinkExtensionClass = application.flexmarkJavaLoader.create( "com.vladsch.flexmark.ext.autolink.AutolinkExtension" );
	HtmlRendererClass = application.flexmarkJavaLoader.create( "com.vladsch.flexmark.html.HtmlRenderer" );
	ParserClass = application.flexmarkJavaLoader.create( "com.vladsch.flexmark.parser.Parser" );

	// Create our options instance - this dataset is used to configure both the parser
	// and the renderer.
	options = application.flexmarkJavaLoader.create( "com.vladsch.flexmark.util.options.MutableDataSet" ).init();

	// Define the extensions we're going to use. In this case, the only extension I want
	// to add is the Autolink extension, which automatically turns URLs into Anchor tags.
	// --
	// NOTE: If you want to add more extensions, you will need to download more JAR files
	// and add them to the JavaLoader class paths.
	options.set(
		ParserClass.EXTENSIONS,
		[
			AutolinkExtensionClass.create()
		]
	);

	// Configure the Autolink extension. By default, this extension will create anchor
	// tags for both WEB addresses and MAIL addresses. But, I don't want "mailto:"
	// links since they require a native mail app to be configured. As such, I am
	// going to configure the Autolink extension to ignore any "link" that looks like
	// an email. This should result in only WEB addresses getting linked.
	options.set(
		AutolinkExtensionClass.IGNORE_LINKS,
		javaCast( "string", "[^@:]+@[^@]+" )
	);

	// Create our parser and renderer - both using the options.
	// --
	// NOTE: In the demo, I'm re-creating these on every page request. However, in
	// production I would probably cache both of these inside of some Abstraction
	// (such as MarkdownParser.cfc) which would, in turn, get cached inside the
	// application scope.
	parser = ParserClass.builder( options ).build();
	renderer = HtmlRendererClass.builder( options ).build();

	// Parse the markdown into an AST (Abstract Syntax Tree) document node.
	document = parser.parse( javaCast( "string", markdown ) );

	// Render the AST (Abstract Syntax Tree) document into an HTML string.
	html = renderer.render( document );

</cfscript>

<!doctype html>
<html lang="en">
<head>
	<meta charset="utf-8" />
	<title>
		Using Flexmark 0.32.24 To Parse Markdown Into HTML in ColdFusion
	</title>
</head>
<body>

	<h1>
		Using Flexmark 0.32.24 To Parse Markdown Into HTML in ColdFusion
	</h1>

	<h2>
		Rendered Output:
	</h2>

	<hr />

	<cfoutput>#html#</cfoutput>

	<hr />

	<h2>
		Rendered Markup:
	</h2>

	<pre class="language-html"
		><code class="language-html"
		><cfoutput>#encodeForHtml( html )#</cfoutput></code></pre>

	<!-- For our fenced code-block syntax highlighting. -->
	<link rel="stylesheet" type="text/css" href="./vendor/prism-1.14.0/prism.css" />
	<script type="text/javascript" src="./vendor/prism-1.14.0/prism.js"></script>

</body>
</html>

In this demo, I'm re-creating the Flexmark Parser and Renderer on each page load since it makes the development life-cycle very easy. In a real application, however, I'd be caching these objects inside some sort of abstraction (like a MarkdownService.cfc), which would itself be cached in some persistent scope.
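As a rough idea of what that abstraction might look like, here is a minimal sketch of such a component. The component name (MarkdownService.cfc), the init() argument, and the toHtml() method are all hypothetical names chosen for illustration; only the Flexmark class names and calls are taken from the demo above.

```cfml
component
	output = false
	hint = "I parse markdown into HTML using cached Flexmark objects (hypothetical sketch)."
	{

	/**
	* I build the Flexmark parser and renderer once so that they can be reused.
	* ASSUMPTION: the given JavaLoader can already see the Flexmark JAR files.
	*/
	public any function init( required any javaLoader ) {

		var AutolinkExtensionClass = javaLoader.create( "com.vladsch.flexmark.ext.autolink.AutolinkExtension" );
		var HtmlRendererClass = javaLoader.create( "com.vladsch.flexmark.html.HtmlRenderer" );
		var ParserClass = javaLoader.create( "com.vladsch.flexmark.parser.Parser" );
		var options = javaLoader.create( "com.vladsch.flexmark.util.options.MutableDataSet" ).init();

		// Same extension setup as the demo page.
		options.set( ParserClass.EXTENSIONS, [ AutolinkExtensionClass.create() ] );

		variables.parser = ParserClass.builder( options ).build();
		variables.renderer = HtmlRendererClass.builder( options ).build();

		return( this );

	}

	/**
	* I convert the given markdown into an HTML string.
	* NOTE: The output is NOT sanitized.
	*/
	public string function toHtml( required string markdown ) {

		var document = variables.parser.parse( javaCast( "string", markdown ) );

		return( variables.renderer.render( document ) );

	}

}
```

With something like this in place, onApplicationStart() could cache it with application.markdownService = new MarkdownService( application.flexmarkJavaLoader ), and each page request would only need to call application.markdownService.toHtml( markdown ).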

That said, there's not a whole lot going on here. In the options object, I'm telling Flexmark to use the Autolinking extension. And, I'm also configuring the Autolinking extension. By default, the Autolinking extension will match WEB addresses (ex. www.*) and MAILTO addresses (ex. ben@bennadel.com). I don't want "mailto:" links, though, mainly because they require a native mail app to be configured. So, I am providing a Regular Expression pattern that will get the Autolinking extension to ignore links that look like email addresses.
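To get a feel for what that pattern catches, here is a small sketch that tests it against a few sample values. It assumes that Flexmark compares the pattern against the whole link text (I have not confirmed that in the Flexmark source), so it uses Java's String.matches(), which is also a whole-string match. The candidate values are just illustrations.

```cfml
<cfscript>

	// The same pattern that the demo passes to IGNORE_LINKS.
	ignorePattern = "[^@:]+@[^@]+";

	// Illustrative link candidates.
	candidates = [
		"ben@bennadel.com",
		"www.bennadel.com",
		"https://www.bennadel.com"
	];

	for ( candidate in candidates ) {

		// ASSUMPTION: whole-string matching mirrors how the extension applies the
		// pattern. String.matches() only succeeds if the ENTIRE value matches.
		isIgnored = javaCast( "string", candidate ).matches( javaCast( "string", ignorePattern ) );

		writeOutput( "#encodeForHtml( candidate )# matches ignore pattern: #isIgnored#<br />" );

	}

</cfscript>
```

Only the first candidate contains an "@" with text on both sides, so it is the only one that matches the pattern. The two web addresses do not match and would still be eligible for autolinking.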

Now, if we run this page in the browser, we get the following output:

[Image: Using Flexmark 0.32.24 to parse markdown content into HTML output using ColdFusion.]

As you can see, our markdown content was easily converted into HTML and rendered to the page. In this case, I'm using Prism.js to format the fenced code-blocks on the client-side. But, the rest of this is pure markdown to HTML support.

Once again, just a reminder that markdown will allow any embedded HTML code. As such, markdown is not secure on its own. Either the markdown content has to come from a trusted source, or the rendered HTML has to be run through a sanitization process, like AntiSamy.
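For untrusted content, the sanitization step might look something like the following sketch. The application.antisamyJavaLoader variable and the policy file path are assumptions made for illustration (this demo does not load the AntiSamy JAR files); the Policy.getInstance(), AntiSamy.scan(), and getCleanHTML() calls come from the AntiSamy Java API.

```cfml
<cfscript>

	// ASSUMPTION: a second JavaLoader, configured with the AntiSamy JAR files and
	// their dependencies, has been cached in the application scope.
	antisamyLoader = application.antisamyJavaLoader;

	// ASSUMPTION: the policy file lives at this hypothetical path. The policy
	// defines which tags, attributes, and CSS are allowed to survive.
	policy = antisamyLoader
		.create( "org.owasp.validator.html.Policy" )
		.getInstance( javaCast( "string", expandPath( "/antisamy/antisamy-policy.xml" ) ) )
	;

	antiSamy = antisamyLoader.create( "org.owasp.validator.html.AntiSamy" ).init();

	// Scan the HTML that Flexmark rendered from the untrusted markdown.
	results = antiSamy.scan( javaCast( "string", html ), policy );

	// Only the sanitized markup should be written to the page.
	safeHtml = results.getCleanHTML();

</cfscript>
```

Depending on the policy, embedded markup like the style block and the inline style attribute in the demo content may get stripped from the output.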

Now, I'm one step closer to being able to enable markdown in my blog comments. And, hopefully this helps anyone else who is interested in processing markdown content in ColdFusion.



Reader Comments

@All,

One thing I realized last night is that I was losing all the soft-line breaks in the rendered output of the parsed content. That's because, by default, markdown doesn't care about line-breaks inside a single content block. However, given the fact that I am just closing a Haiku content, line-breaks are kind of important.

To get Flexmark to turn soft-breaks into hard-breaks, you have to set an option on the Renderer:

options.set(
	HtmlRendererClass.SOFT_BREAK,
	javaCast( "string", "<br />#chr( 10 )#" )
);

This will use the <br />\n string wherever it encounters a soft-break inside a contiguous content block.
