Lately, I've been putting a lot of thought into the way that I am storing content for my blog entries. As part of this thinking, I've been considering more XSLT strategies to address aspects of my content management system that I've never quite liked. One example is the way in which I display images within my content. All of my images get displayed with a border and a dark, surrounding glow. I accomplish this effect by putting each image inside a TABLE tag that contains many additional TDs used purely for formatting.
The use of a TABLE tag for this formatting is not what bothers me (I'm not that anal about TABLE tags); what bothers me is that all the TABLE XHTML is actually stored in my database as part of the content data. This is a serious merging of my data and my display in a way that I'm not comfortable with. If I could go back and do it all over again, I'd store each image as a simple tag and then replace it on display.
Now, I chose the word, "replace," here for a very specific reason - because my first instinct would be to use some sort of regular expression "replace." As much as I think regular expressions are a supreme gift, they are not the right tool for all replace-type situations. When it comes to XHTML, we're not really looking for text patterns - we're looking for particular sets of nodes within a structured, hierarchical document object model.
It is exactly this type of DOM replace action that XSLT and ColdFusion's XMLTransform() excel at. And, as a first step in this direction, I wanted to experiment with transforming content data, wrapping the IMG tags in a TABLE and then copying every other node as-is. Once I can do a generic copy of XHTML data with a single hand-picked exception, I should be able to extend this functionality to encompass all data transformations desired within the entire set of blog content data.
<!--- Define XHTML style data. --->
<cfsavecontent variable="strData">

    <div id="contentarea">

        <p>
            Maria Bello is so awesome. Just look at her in this
            polaroid picture - you can just tell she has a great
            attitude.
        </p>

        <p class="image">
            <img src="http://farm4.static.flickr.com/3201/3069379561_2e8cb1be2c.jpg" />
        </p>

        <p>
            This makes me want to go and watch A History of Violence
            again; what an awesome film. She is so wicked hot in it!
            Oh man!
        </p>

    </div>

</cfsavecontent>


<!--- Define the XSLT. --->
<cfsavecontent variable="strXSLT">

    <!--- XML declaration. --->
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <xsl:transform
        version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

        <!--- Match all generic nodes. --->
        <xsl:template match="*">

            <!--- Copy this node (non-deep copy). --->
            <xsl:copy>

                <!---
                    Make sure that all attributes are copied over
                    for the current node.
                --->
                <xsl:copy-of select="@*" />

                <!---
                    Apply templates to all of its child nodes (so
                    that they can be copied).
                --->
                <xsl:apply-templates />

            </xsl:copy>

        </xsl:template>

        <!---
            Look for any image nodes. We need to take these and
            format them with our special image display.
        --->
        <xsl:template match="p[ @class = 'image' ]">

            <table class="imageborder" cellspacing="0" cellpadding="0" border="0" width="100%">
            <tbody>
                <tr>
                    <td rowspan="3" width="50%"><xsl:call-template name="nbsp" /></td>
                    <td class="nw"><xsl:call-template name="nbsp" /></td>
                    <td class="n"><xsl:call-template name="nbsp" /></td>
                    <td class="ne"><xsl:call-template name="nbsp" /></td>
                    <td rowspan="3" width="50%"><xsl:call-template name="nbsp" /></td>
                </tr>
                <tr>
                    <td class="w"><xsl:call-template name="nbsp" /></td>
                    <td class="c">
                        <!---
                            Copy the actual image node. Since we don't
                            have any special way in which we want to
                            transform this, we can just apply templates
                            to the child nodes, which will call our
                            generic copy template. This is actually a
                            good thing since it allows us to have more
                            than just IMG tags in place (for example,
                            a link tag wrapping an image).
                        --->
                        <xsl:apply-templates />
                    </td>
                    <td class="e"><xsl:call-template name="nbsp" /></td>
                </tr>
                <tr>
                    <td class="sw"><xsl:call-template name="nbsp" /></td>
                    <td class="s"><xsl:call-template name="nbsp" /></td>
                    <td class="se"><xsl:call-template name="nbsp" /></td>
                </tr>
            </tbody>
            </table>

        </xsl:template>

        <!---
            Create a named-template for easy NBSP output. By default,
            the text output escapes certain characters that we
            actually want to render.
        --->
        <xsl:template name="nbsp">
            <xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text>
            <br />
        </xsl:template>

    </xsl:transform>

</cfsavecontent>


<cfoutput>

    <!--- Include style sheets from site. --->
    <link rel="stylesheet" type="text/css" href="content.css"></link>
    <link rel="stylesheet" type="text/css" href="main.css"></link>

    <!---
        Transform the XHTML. Let's see if this creates an accurate
        copy of the XHTML.
    --->
    #XMLTransform( Trim( strData ), Trim( strXSLT ) )#

</cfoutput>
As you can see, the first chunk of data is my "content." This contains several paragraphs of text, one of which contains just an image. The second chunk of data is my XML Transformation code. As I demonstrated earlier today, XSLT's Copy and Copy-Of commands properly copy XHTML data, so I knew that would work. Then, I have a special template match that looks for paragraphs flagged with the class "image." Rather than just blindly copying these paragraphs, this more specific template takes priority over the generic one, intercepts them, and outputs the nested IMG tag within a surrounding TABLE tag.
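For comparison, the same "copy everything, with one exception" pattern is often written with the standard XSLT 1.0 identity template, which matches `@*|node()` so that attributes, text, comments, and processing instructions are all copied by a single rule. This is only a minimal sketch and not the code from the listing above; to keep it short, the exception wraps the paragraph's children in a DIV rather than the full TABLE, and the `imageframe` class name is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Identity template: copy every attribute and node as-is. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" />
        </xsl:copy>
    </xsl:template>

    <!--
        The one exception: paragraphs flagged with class="image".
        This pattern is more specific than the identity template,
        so it takes priority for those paragraphs.
    -->
    <xsl:template match="p[ @class = 'image' ]">
        <!-- "imageframe" is a hypothetical class name. -->
        <div class="imageframe">
            <!-- Children fall through to the identity template. -->
            <xsl:apply-templates />
        </div>
    </xsl:template>

</xsl:transform>
```

A stylesheet like this can be passed to XMLTransform() as its second argument in the same way as strXSLT. Keep in mind that standard XML comments are used here because the stylesheet stands alone; ColdFusion-style comments only work when the stylesheet is built inside a CFSaveContent tag.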
When I run the code above, I get the following output:
Not only is my content data stored in a very straightforward, data-centric way, but I can still achieve the desired, complex image formatting.
I wish I had known more about XSLT when I first started authoring my blog software; I think it would have totally changed the way I store and output my data. XSLT and ColdFusion's XMLTransform() are really great tools for keeping a strong line between the data and the display of that data.
I don't know much about XSLT. I'm not sure why you would define the data as XHTML, though. I think the whole point is that it is not supposed to have formatting.
XHTML doesn't have any inherent formatting. It's just a structured XML document. All the formatting is actually provided via CSS and through my XSLT.
Interesting idea, Ben. I can see how you could use this to make code alterations to multiple XHTML pages within a website in situations where changing site-wide stylesheets isn't sufficient.
It's sort of like page templating after the fact. :)
Exactly - you don't "format yourself into a corner," so to speak. Especially with data that has lots of bells and whistles added to it. Like for me, my code buttons (View in window, download, download as zip, etc.) are all added during output using regular expressions (a poor choice, but all I could think of at the time).
To me, those "utility" links are not really part of the data - they are part of the user's experience of that data and therefore should be added afterward. The XSLT stuff feels like the right approach.
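Here is a rough sketch of how that kind of "utility" link could be added at output time with XSLT instead of regular expressions. It is not how the blog currently works: the `div class="code"` wrapper, its `id` attribute, the `codeutilities` class, and the `view.cfm` / `download.cfm` URLs are all assumptions made up for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Copy all content through unchanged by default. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" />
        </xsl:copy>
    </xsl:template>

    <!--
        Hypothetical markup: each code sample is stored as a plain
        <div class="code" id="..."> with no links in the data.
    -->
    <xsl:template match="div[ @class = 'code' ]">

        <!-- Copy the stored code sample itself. -->
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" />
        </xsl:copy>

        <!-- Add the utility links after it, at display time only. -->
        <p class="codeutilities">
            <a href="view.cfm?id={@id}">View in window</a>
            <xsl:text> | </xsl:text>
            <a href="download.cfm?id={@id}">Download</a>
        </p>

    </xsl:template>

</xsl:transform>
```

The stored content stays free of the links, so changing or removing them later would only mean editing this one template.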
On a related note I found that using XSL/T to parse content was significantly faster than regex - especially when the documents get large.
I found Dave Pawson's site a great XSL/T resource:
It's good to know that XML / XSLT is faster, especially on large documents. I think it makes sense too - especially if you have to make several changes; once you absorb the cost of parsing the XML into an actual document, I think each subsequent edit / transformation becomes inconsequential in an XSLT action. Whereas in RegEx, it's all string parsing and it gains no benefit from structure.
I need to code some stuff for which I need to store data in XML and display it using ColdFusion. Basically, I need to show content in pop-up divs, but I need to pull the content from an XML file.
Please advise me on this.
If you have stuff stored in XML, then yeah, XSLT would probably be a good way to go about it. Of course, if the XML is simple, you can just extract values as well.
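As a very rough sketch of the XSLT route, suppose the XML file had a root `popups` element containing `popup` elements, each with an `id` attribute plus `title` and `body` children. That structure is purely an assumption for the example (I don't know what your file looks like), as are the class names in the output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="html" />

    <!-- Wrap all of the pop-ups in a single container. -->
    <xsl:template match="/popups">
        <div id="popups">
            <xsl:apply-templates select="popup" />
        </div>
    </xsl:template>

    <!-- Turn each (hypothetical) popup node into its own DIV. -->
    <xsl:template match="popup">
        <div class="popup" id="popup-{@id}">
            <h3>
                <xsl:value-of select="title" />
            </h3>
            <div class="popupbody">
                <xsl:value-of select="body" />
            </div>
        </div>
    </xsl:template>

</xsl:transform>
```

The string that XMLTransform() returns for this could then be written to the page, with the showing and hiding of each DIV left to CSS and JavaScript.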
Ben, when I try this approach with a <JOBDESCRIPTION> node that contains CDATA, it strips out the <![CDATA...]]> tag and escapes all of the html characters. When I test it by simply adding the CDATA tag to your example, this is what gets returned:
<?xml version="1.0" encoding="UTF-8"?> <JOBDESCRIPTION>&lt;p class="content"&gt;This is some text with some other formatted text &lt;strong&gt;contained within in&lt;/strong&gt;. While this is valid XHTML, I am wondering how it will hold up when put through &lt;em&gt;XSLT&lt;/em&gt; node copying.&lt;img src="about:blank" /&gt; Embedded image.&lt;/p&gt;</JOBDESCRIPTION>
I really need it to be returned like so:
<?xml version="1.0" encoding="UTF-8"?><![CDATA[<p class="content">This is some text with some other formatted text <strong>contained within in</strong>. While this is valid XHTML, I am wondering how it will hold up when put through <em>XSLT</em> node copying.<img src="about:blank"/> Embedded image.</p>]]>
I've been working on this for days now trying everything I can think of and I'm starting to look like I don't know what I'm doing to the bosses here. I'd appreciate any help you can give me. Ray
When you say that you are trying "this approach", what do you mean exactly? Are you saying that you are transforming XML with XSLT?
Did you try disabling the output escaping:
Try looking at this post:
There, I am using CDATA-escaped HTML and am outputting it using the disabled outputting escaping technique I mentioned in my previous comment.
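In a stylesheet, the disabled output escaping approach looks roughly like the following. This is a minimal sketch that assumes the JOBDESCRIPTION node from the comment above holds only CDATA-wrapped markup; it also relies on the XSLT processor honoring `disable-output-escaping`, which the XSLT 1.0 spec leaves optional.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Copy everything else through unchanged. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" />
        </xsl:copy>
    </xsl:template>

    <!--
        Write the text of JOBDESCRIPTION back out as raw markup
        instead of entity-escaped text.
    -->
    <xsl:template match="JOBDESCRIPTION">
        <xsl:copy>
            <xsl:value-of select="." disable-output-escaping="yes" />
        </xsl:copy>
    </xsl:template>

</xsl:transform>
```

If the CDATA wrapper itself has to survive in the output, XSLT 1.0 also provides `<xsl:output cdata-section-elements="JOBDESCRIPTION" />`, which asks the serializer to write the text of that element as a CDATA section; in that case the plain identity template is enough and the JOBDESCRIPTION override is not needed.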
<table class="imageborder" cellspacing="0" cellpadding="0" border="0" width="100%">
cruft could be moved into the style sheet of the html output
It's funny - table-based CSS is still something I have yet to fully wrap my head around. Certainly, I do things like padding and alignment in CSS; but when it comes to the border-collapse stuff, I have some weird mental block. I can't explain why.
Also, I think I have seen some strange interaction with CSS-based "width" values. This might be a hold-over from some of the older browsers, but I think I had a few situations where CSS-based "width" interacted with the box-model where attribute-based "width" did not. I could toootally be mistaken on that, though.
On the topic of table CSS, one thing that I have noticed that I can't figure out how to get rid of: THead/TBody margins. In Chrome, I get a space between the THead and TBody content areas. This doesn't seem to happen in any of the other browsers. Weird.
Instead of using XMLTransform(), is there any way to display the results in the browser? Maybe using JSON?