Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.
Ben Nadel at the jQuery Conference 2009 (Cambridge, MA) with: Paul Irish
Ben Nadel at the jQuery Conference 2009 (Cambridge, MA) with: Paul Irish@paul_irish )

Grouping The MySQL PROCESSLIST By IP Address To View Connection Counts

By Ben Nadel on
Tags: SQL

I'm a huge fan of SQL. But, I've never been a database administrator. At least, nothing like Brad Brewer - our DBA and general data architect. Lately, however, I've been trying to take a more active role in the monitoring of our database infrastructure; which means, learning more about MySQL features like the PROCESSLIST and SLOW_LOG tables. For example, today, we saw an odd increase in the number of connections coming into the master database. So, in an attempt to figure out which client machine was causing the unexpected increase, I had to figure out how to group the PROCESSLIST table by host IP address.

Now, I've known about the SHOW FULL PROCESSLIST command for a while. But, I had no idea that you could actually run queries against the PROCESSLIST table in MySQL. In the past, I've certainly tried to do this with no luck. Thankfully I came across a really helpful StackOverflow post today that taught me how to execute such a query by denoting the PROCESSLIST as a child of the INFORMATION_SCHEMA.

Once I was able to actually run queries against the PROCESSLIST table, all I had to do was strip the port off the host column and do a little grouping on the naked IP address:

  • SELECT
  • tmp.ipAddress,
  •  
  • -- Calculate how many connections are being held by this IP address.
  • COUNT( * ) AS ipAddressCount,
  •  
  • -- For each connection, the TIME column represent how many SECONDS it has been in
  • -- its current state. Running some aggregates will give us a fuzzy picture of what
  • -- the connections from this IP address is doing.
  • FLOOR( AVG( tmp.time ) ) AS timeAVG,
  • MAX( tmp.time ) AS timeMAX
  • FROM
  • -- Let's create an intermediary table that includes an additional column representing
  • -- the client IP address without the port.
  • (
  •  
  • SELECT
  • -- We don't actually need all of these columns for the demo. But, I'm
  • -- including them here to demonstrate what fields COULD be used in the
  • -- processlist system.
  • pl.id,
  • pl.user,
  • pl.host,
  • pl.db,
  • pl.command,
  • pl.time,
  • pl.state,
  • pl.info,
  •  
  • -- The host column is in the format of "IP:PORT". We want to strip off
  • -- the port number so that we can group the results by the IP alone.
  • LEFT( pl.host, ( LOCATE( ':', pl.host ) - 1 ) ) AS ipAddress
  • FROM
  • INFORMATION_SCHEMA.PROCESSLIST pl
  •  
  • ) AS tmp
  • GROUP BY
  • tmp.ipAddress
  • ORDER BY
  • ipAddressCount DESC

This returns a list of client IP addresses along with the number of connections that the given client is holding.

Honestly, I can't tell you how long I've longed to be able to query against the PROCESSLIST table. Being able to do that now is a simple yet major breakthrough for me in my journey towards SQL mastery. It makes things like grouping the PROCESSLIST by IP address finally possible.




Reader Comments

Post A Comment

You — Get Out Of My Dreams, Get Into My Comments
Live in the Now
Oops!
Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.