Ben Nadel at NCDevCon 2011 (Raleigh, NC) with: Christian N. Abad

Using Unicode And Special Characters Within The content Property In CSS


I'm in love with the Right Arrow character. That's the HTML entity, &rarr;, or the Unicode character U+2192. It renders like this (→). I've started to use it a lot with various calls-to-action in my user interface (UI). But, I don't always want it to be part of the DOM (Document Object Model) since it's just decorative. So, I sometimes define it in the CSS using the content property of a pseudo element. But, I can never freaking remember the proper escape sequence for Unicode. As such, I wanted to put this post together as a note to self that I can quickly look up the next time I can't remember how to use Unicode and HTML entities within the content property in CSS.

Run this demo in my JavaScript Demos project on GitHub.

View this code in my JavaScript Demos project on GitHub.

To use Unicode in the content property, I need to use \ followed by the hexadecimal code point value (one to six hex digits). So, for the right arrow (U+2192), I would use:

span:before {
	content: "\2192" ;
}
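
Since the escape is just the code point written in hexadecimal, it can be derived from the character itself. Here's a rough sketch of that idea in JavaScript. The toCssEscape() function is a hypothetical helper, not part of the demo below, and it only looks at the first code point in the string:

```javascript
// Hypothetical helper (an assumption, not part of the original demo): build the
// CSS escape sequence for the FIRST code point in the given string.
function toCssEscape( character ) {

	// codePointAt() returns the full code point, even for characters outside of
	// the Basic Multilingual Plane (such as Emoji).
	var codePoint = character.codePointAt( 0 );

	// A CSS escape is a back-slash followed by the hexadecimal code point.
	return( "\\" + codePoint.toString( 16 ) );

}

console.log( toCssEscape( "\u{2192}" ) ); // \2192
console.log( toCssEscape( "\u{a9}" ) ); // \a9
console.log( toCssEscape( "\u{1f600}" ) ); // \1f600
```

The values that get logged are what would go inside the quotes of the content property.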

This Unicode escape sequence format consumes a single whitespace character that immediately follows the code point; that whitespace acts as a terminator for the escape. Meaning, if I wanted to include the Copyright character (U+00A9) followed immediately by the year, I could do this:

span:before {
	content: "\a9 2022" ;
}

The space between the Unicode escape and the "2022" year won't be rendered - it's there only to prevent the "2022" token from being interpreted as part of the Unicode escape. If I wanted to render a space, I'd have to include two spaces in the content property:

span:before {
	content: "\a9  2022" ; /* Note the two spaces. */
}
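
To make the space-swallowing behavior concrete, here's a simplified sketch that approximates how a quoted CSS string gets decoded. The decodeCssString() function is a hypothetical helper written for this illustration; it is not a full CSS parser (it ignores things like escaped newlines), and it assumes that an out-of-range code point gets replaced with U+FFFD:

```javascript
// Hypothetical helper (an assumption for illustration): approximate how escape
// sequences inside a quoted CSS string are decoded. This is a simplified sketch.
function decodeCssString( value ) {

	// Match either a hex escape (1-6 hex digits followed by ONE optional
	// whitespace character, which gets consumed) or a back-slash followed by any
	// other single character.
	var pattern = /\\([0-9a-fA-F]{1,6})[ \t\n\f\r]?|\\([\s\S])/g;

	return value.replace(
		pattern,
		function( match, hex, literal ) {

			// A back-slash followed by a non-hex character yields that character.
			if ( hex === undefined ) {

				return( literal );

			}

			var codePoint = parseInt( hex, 16 );
			var isSurrogate = ( ( codePoint >= 0xd800 ) && ( codePoint <= 0xdfff ) );

			// Assumption: invalid code points become the replacement character.
			if ( ( codePoint === 0 ) || ( codePoint > 0x10ffff ) || isSurrogate ) {

				return( "\ufffd" );

			}

			return( String.fromCodePoint( codePoint ) );

		}
	);

}

// One space: consumed as the escape terminator, not rendered.
console.log( decodeCssString( "\\a9 2022" ) === "\u{a9}2022" ); // true
// Two spaces: the first is consumed, the second is rendered.
console.log( decodeCssString( "\\a9  2022" ) === "\u{a9} 2022" ); // true
// No space: "a92022" is read as one (invalid) code point.
console.log( decodeCssString( "\\a92022" ) === "\ufffd" ); // true
```

The last case is the one to watch out for: without the terminating space, the year gets absorbed into the escape sequence.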

To see this in action, I've put together a small demo in which I use a few Emoji as part of the content property:

<!doctype html>
<html lang="en">
<head>
	<meta charset="utf-8" />
	<title>
		Using Unicode And Special Characters Within The content Property In CSS
	</title>
	<style type="text/css">

		li:nth-child( 1 )::before {
			content: "Emoji: \1f600"  ;
		}
		li:nth-child( 2 )::before {
			content: "Emoji: \1f618"  ;
		}
		li:nth-child( 3 )::before {
			content: "Emoji: \1f628"  ;
		}
		li:nth-child( 4 )::before {
			/**
			* In this case, the Copyright symbol is followed by a numeric year. In order
			* to prevent the year from being parsed as part of the Unicode escape
			* sequence, we have to include a single space. This space is NOT RENDERED as
			* part of the content. If we did want to render a physical space, we'd have to
			* include two spaces (one as part of the escape sequence and one to render).
			* Or, we could break the "content" property up into two quoted values.
			*/
			content: "Ben Nadel \a9 2022"  ;
			/* Using two quoted values without a space. */
			x-content: "Ben Nadel \a9" "2022"  ;
			/* Using two quoted values with an additional, rendered space. */
			x-content: "Ben Nadel \a9" " 2022"  ;
		}
		li:nth-child( 5 )::before {
			/* To render back-slash, we have to escape it. */
			content: "Escaped \\2022 back-slash" ;
		}

	</style>
</head>
<body>

	<h1>
		Using Unicode And Special Characters Within The content Property In CSS
	</h1>

	<ul>
		<li><!-- Content to be inserted via CSS. --></li>
		<li><!-- Content to be inserted via CSS. --></li>
		<li><!-- Content to be inserted via CSS. --></li>
		<li><!-- Content to be inserted via CSS. --></li>
		<li><!-- Content to be inserted via CSS. --></li>
	</ul>

	<!--
		For funzies, I wanted to try and output the Unicode characters in JavaScript /
		console.log() as well. Just cause I know I'll forget this in the future.
	-->
	<script type="text/javascript">

		console.group( "Echoing the Computed CSS content Property" );

		for ( var node of document.querySelectorAll( "li" ) ) {

			console.log( getComputedStyle( node, ":before" ).content );

		}

		console.groupEnd();

		// --------------------------------------------------------------------------- //
		// --------------------------------------------------------------------------- //

		// See all about Unicode in JavaScript here:
		// https://dmitripavlutin.com/what-every-javascript-developer-should-know-about-unicode/#3-unicode-in-javascript
		console.group( "Logging Unicode to the Console" );
		console.log( "Emoji: \u{1f600}" );
		console.log( "Emoji: \u{1f618}" );
		console.log( "Emoji: \u{1f628}" );
		console.log( "Ben Nadel \u{a9}2022" );
		console.log( "Escaped \\2022 back-slash" );
		console.groupEnd();

	</script>

</body>
</html>

As a bonus note-to-self, I'm also using the Unicode values in a console.log() statement. And, when we run this in the browser, we get the following output:

Unicode characters rendered in the HTML using Unicode escape sequences within the CSS content property.

And there you have it! Unicode characters are being rendered in the DOM (Document Object Model) via the CSS content property by using the Unicode escape sequence. Hopefully, by writing this down, I won't forget.
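
On the JavaScript side, the \u{...} syntax used in the demo is delimited by braces, so there's no trailing space to worry about the way there is in CSS. As a small sketch of how these escapes relate to other ways of writing the same characters (the variable name here is just for illustration):

```javascript
// The code point escape and String.fromCodePoint() produce the same string.
var grin = "\u{1f600}";

console.log( grin === String.fromCodePoint( 0x1f600 ) ); // true

// Code points above U+FFFF are stored as a surrogate pair, so the string has
// two UTF-16 code units but only one code point.
console.log( grin === "\ud83d\ude00" ); // true
console.log( grin.length ); // 2
console.log( Array.from( grin ).length ); // 1

// The closing brace ends the escape, so the year can follow it directly.
console.log( "\u{a9}2022" === "\u00a92022" ); // true
```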

Why Not Just Include Unicode Characters in the Code File?

This may sound silly or anachronistic. And, maybe I'm just getting old. But, I don't like having any characters in my code that I can't represent with a key-stroke on the physical keyboard. I know that there are Unicode character keyboard utilities in Mac (CMD+CTRL+Space) and Windows; but, I feel emotionally constrained by my physical keyboard.


Reader Comments

2 Comments

Hi Ben,

I always enjoy your journeys into whatever corner of tech that sparks your interest. If you want to dive deeper into emoji (not the movie 😉) - I had a lot of fun looking at modifiers (works a bit like ligatures).

Example:

  • 👩 = \1F469
  • 🏽 = \1F3FD
  • \200D (zero width joiner)
  • 🌾 = \1F33E

When combined you'll get:

  • 👩🏽‍🌾 = \1F469 \1F3FD \200D \1F33E

body:before {
	font-size: 2em;
	content: '\1F469 \1F3FD \200D \1F33E'
}

Here is a very simple pen that uses CSSVariables to hold the unicode values (it makes it a bit easier to use and remember)

https://codepen.io/jakob-e/pen/QWbbKgr

All the best to you and Lucy 🤗


@Jakob,

That's a really cool CodePen! This is some next-level Emoji stuff. It took me years just to stop calling them "Emoticons" 🤪 Now, I "Emoji" at a 1st-grade level. Baby steps. But, this is really interesting. Is this a universal thing? Meaning, is this the way Emoji were originally designed to work? Or, is this platform-specific? I don't know enough even about the concept of ligatures, but I believe the combination of characters is baked into the concept of how Unicode works, right? Like an "e" next to an accent character kind of thing.

Anyway, thank you for the kind words as well. Very interesting stuff!
