At InVision, we recently performed a large infrastructure upgrade. And, after we did this, we noticed that image thumbnailing times and CPU-load shot through the roof. Suddenly, we were seeing massive CPU usage (80%-100%) and thumbnail times that took up to 45 minutes on average! It didn't seem to affect all images; but, when it did, it slowed down everything on the machines.
When this was happening, all of the stack traces looked like this:
- locked <0x6fcba56b> (a sun.java2d.cmm.kcms.ICC_Transform)
After a good deal of Googling, I was able to find some blog posts and forum threads that seemed to be talking about the same problem (although, I am not entirely sure it's all about the same issue - my Java chops are noobish):
- JVM is stuck inside cmmColorConvert
- Resample image - drawImage exception - Java
- 20x slowdown in PNG processing when switching from JDK 1.6.0_17 to 1.6.0_18
- IV19562: CRASH IN JAVA_SUN_AWT_COLOR_CMM_CMMCOLORCONVERT() IN LIBCMM.SO
From what I can piece together here, it seems like certain image operations (such as resizing) on certain PNG images require color profiles to be converted. And, there seems to be a bug in the JDK that keeps it from handling this operation well under high load. One of the links above talks about it being a concurrency problem and a synchronization bug in the JDK.
In one of the other threads, one of the participants talks about why color profiles might need to be changed on the fly:
In your case, the encoded color model seems to differ from the original one which is why the ImageEncodingHelper (from XML Graphics Commons) falls back to encoding sRGB data which has to go through color management. And that's what's making the process very slow. Every pixel is converted to sRGB by the color profile associated with the BufferedImage that has been built from the PNG file.
So, it seems like there might be an already-slow process that is being further aggravated by a concurrency problem in the JDK's color management module, which falls over under load.
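Since the problem didn't seem to affect all images, one way to narrow it down might be to check which images carry a non-sRGB color model before they get resized. The following is only a minimal sketch of that idea, not something from our codebase: the class name `ColorProfileCheck` and the method `needsColorConversion` are made up for illustration, and I'm assuming that "not the built-in sRGB color space" is a reasonable proxy for "will go through color management when drawn into an sRGB image".

```java
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;

// Hypothetical helper (name is an assumption, not part of any library).
final class ColorProfileCheck {

    // Returns true when the image's color model is NOT the JDK's built-in
    // sRGB color space. Assumption: these are the images whose pixels would
    // have to be color-converted when rendered into an sRGB target.
    static boolean needsColorConversion(BufferedImage image) {
        ColorSpace space = image.getColorModel().getColorSpace();
        return !space.isCS_sRGB();
    }
}
```

Calling `ColorProfileCheck.needsColorConversion(image)` on the result of `ImageIO.read()` and logging the answer next to the thumbnail time would at least show whether the slow images and the non-sRGB images are the same set. Note that an image with an embedded ICC profile will report `true` even if that profile is sRGB-like, since it isn't the built-in instance.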
We're currently running on ColdFusion 10; but, it sounds like it's a problem with the underlying Java libraries, not with ColdFusion itself. And, I wish I could tell you that I found a way to fix this in ColdFusion; but, I didn't. We ended up converting [almost] all of our image thumbnail processing into ImageMagick operations on the command-line. Which, by the way, works beautifully!
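To give a rough idea of what shelling out to ImageMagick can look like from the JVM side, here is a small sketch. This is not the command we actually run: the binary name `convert` (ImageMagick 6 naming; newer installs use `magick`), the file paths, the dimensions, and the class name `ImageMagickThumbnail` are all assumptions for illustration. The `-thumbnail` option with a `WIDTHxHEIGHT` geometry is a standard ImageMagick option that fits the image inside the given box.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper (name is an assumption) around the ImageMagick CLI.
final class ImageMagickThumbnail {

    // Builds the argument list for: convert <input> -thumbnail WxH <output>
    // "convert" is assumed to be on the PATH of the server process.
    static List<String> buildCommand(String inputPath, String outputPath,
                                     int maxWidth, int maxHeight) {
        return Arrays.asList(
            "convert",
            inputPath,
            "-thumbnail", maxWidth + "x" + maxHeight,
            outputPath
        );
    }

    // Runs the command, passes its output through to this process's
    // console, and returns the exit code (0 means success).
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
            .inheritIO()
            .start();
        return process.waitFor();
    }
}
```

Passing each argument as its own list item (rather than one concatenated string) means file names with spaces don't need shell quoting. The resizing itself then happens in a separate process, outside the JVM's image libraries.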
Anyway, no solution here; but, if others run into the same problem, hopefully this data can help point them into the right debugging direction.
As a cross-reference opportunity, Charlie Arehart has a great blog post on other reasons that ColdFusion image processing can be problematic:
I have personally seen great benefit from changing the interpolation algorithm in previous debugging attempts.
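For anyone who hasn't played with interpolation settings, here is a minimal Java-level sketch of what "changing the interpolation algorithm" means. It is an illustration only: the class name `InterpolatedResize` is made up, it is not how ColdFusion implements resizing internally, and the fixed `TYPE_INT_RGB` target (which drops any alpha channel) is a simplifying assumption.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper (name is an assumption, not part of any library).
final class InterpolatedResize {

    // Scales the source to width x height using the given interpolation hint,
    // e.g. RenderingHints.VALUE_INTERPOLATION_NEAREST_NEIGHBOR (fastest),
    // VALUE_INTERPOLATION_BILINEAR, or VALUE_INTERPOLATION_BICUBIC (slowest).
    static BufferedImage resize(BufferedImage source, int width, int height,
                                Object interpolationHint) {
        // Assumption: an opaque sRGB target is acceptable for thumbnails.
        BufferedImage target = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = target.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, interpolationHint);
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose(); // release the graphics context even if drawing fails
        }
        return target;
    }
}
```

For example, `InterpolatedResize.resize(image, 200, 150, RenderingHints.VALUE_INTERPOLATION_BILINEAR)` trades some sharpness for speed compared to bicubic. Cheaper interpolation reduces the per-pixel work, but it wouldn't remove a color conversion step if the source image needs one.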
Glad to see someone else has run into this! Thought we were alone in the CF world. I found it only happens in Java 1.8. I also have found the CPU usage issue happens when processing Office documents.
Because these processes are critical to our apps I've had to downgrade back to Java 1.7 until the issue is fixed.
I submitted this as a bug to Adobe and they've verified it.
Ha ha, glad to shoulder the burden with you :D We're actually still on Java 1.7 (something). So, it seems like it might be a combination of things that create the perfect storm. So frustrating!
CF is using a very old implementation... check out https://github.com/thebuzzmedia/imgscalr it's really good...
Interesting. We were running our testing servers on an early install of CF11 for nearly a year to iron out any "upgrade issues". This version installed Java 1.7.
When we finally moved CF11 to our production site we started with a fresh CF11 download which, unbeknownst to us, now installed with Java 1.8. That's when we started encountering all kinds of weird issues like this one that we hadn't seen before. Downgrading to 1.7 has cleared most of them up.
Yes, frustrating, indeed.
I'm still using ColdFusion 9 w/Windows and our solution was to avoid using Java and use Jukka Manner's C++ CFX_OpenImage tag:
DISCLAIMER: A Linux/Unix port is unavailable.
My personal experience with the Windows 64-bit tag has been that it's faster when it comes to retrieving the image dimensions, doesn't crash when encountering CMYK images, rescales images faster, generates images with smaller file sizes and doesn't care which version of Java is being used (CF9, 10 or 11).
Hey Ben, cool stuff (well, bummer to see, but thanks for sharing).
Since you see it happening on 1.7, but others see it happening on 1.8, it will be interesting to see if it ends up really being in the JVM, or in CF, or perhaps in some library that CF uses.
To that point, you don't mention what update of CF10 you were on. Also, Paul, you say you were on 11, but you don't say if you had this specific issue. Did you? Or does anyone else reading here? And if so, what update of CF11 was that?
It would be helpful to know if it's a problem on the latest updates of 10 and 11, since Adobe has indeed been knocking out lots of bugs in the last couple of updates.
Sorry, I don't have my Java dumps to see if what I encountered was this 'exact' issue, but it sure sounds similar. We first noticed it when using CFPDF to create thumbnails and have also seen CPU usage and CF/IIS timeouts when processing Excel files with CFDocument.
We saw the issue with CF11 update 4 on JVM 1.8. Downgrading to JVM 1.7 cleared up the issues.
Adobe verified the bug I submitted here:
Good questions - here's the information from one of the node's CFIDE info page:
Server Product: ColdFusion
Tomcat Version: 18.104.22.168
Operating System: UNIX
OS Version: 3.13.0-36-generic
Update Level: chf10000015.jar
Adobe Driver Version: 4.1 (Build 0001)
Java Version: 1.7.0_71
Java Vendor: Oracle Corporation
Java VM Version: 24.71-b01
Java VM Name: Java HotSpot(TM) 64-Bit Server VM
Java Class Version: 51.0
For all I know, this problem has been happening locally (in development) for a long time; but, there's so little load, and no FusionReactor, in development that I wouldn't have noticed it.
Also, from what I was reading, I am not sure that what I am seeing is a "ColdFusion bug", per se, insomuch as it seems like it might be a problem with the Java image libraries. Of course, since ColdFusion is consuming them, it could be called a bug; and, if ColdFusion could find a way around that, internally, that would be great.
Of course, since I barely know thing-one about Java, take whatever I say with a grain of salt.
You might also want to check out GraphicsMagick (http://www.graphicsmagick.org/). Faster processing and fewer resources; plus, it uses a very similar syntax, since it's derived from an earlier version of ImageMagick.
I've heard really good things about GraphicsMagick as well. I only ended up using ImageMagick because that's what we had installed on all the servers. I didn't want to burden the Ops team with having to reprovision the servers. But, I'll definitely check it out for personal interest.