I think application log data is thrilling. It exists in a multi-faceted, complex context: it needs to be machine-readable for monitoring and alerting, but it also needs to be human-readable for debugging and problem solving. It needs to contain robust information but, at the same time, it needs to be properly sanitized to reduce attack vectors and adhere to compliance standards. It needs to be loquacious and actionable but, hopefully, not noisy. Application log data needs to, at times, satisfy seemingly competing priorities. But, even in the middle of all this complexity, amidst the pull in many directions, I believe that log data should avoid using numeric log levels. To most humans, especially at 3am, a log level of "30" doesn't mean all that much.
I am sure there are specifications on how to do logging. But, a specification is for engineers, not for humans. And, when it comes to logging, it's important to remember how many humans may actually be consuming your logs:
- Product engineers who wrote the code that's generating the logs.
- Product engineers who've never seen the code that's generating the logs.
- Platform engineers.
- Tooling engineers.
- Data analysts.
- Support people.
- Product managers.
- Project managers.
- Engineering managers.
- Database engineers.
- Monitoring and alerting algorithms (OK, obviously not people, but configured by people).
Given the diversity of eyes on the data, I think it's much more likely that we can all understand what a log level of "information" or "error" or "warning" means. And, I would also posit that it's very unlikely that we can all understand (or even agree on) what a log level of 6 or 30 means. And, given that we use logs to debug incidents, sometimes after being woken up by pagers in the middle of the night, I think erring on the side of human-readable is an important priority.
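To make that concrete, here is a minimal sketch of what emitting a readable level can look like. It uses Python's standard `logging` module, which happens to track levels as numbers internally (`logging.WARNING` is 30) while also exposing the name on each record. The formatter class, the logger name, and the message are hypothetical, chosen only for illustration; the output is captured in memory so that it can be inspected.

```python
import io
import json
import logging


class ReadableLevelFormatter(logging.Formatter):
    """Hypothetical formatter: writes the level as a word, not a number."""

    def format(self, record):
        return json.dumps({
            # record.levelno would be 30 here; record.levelname is "WARNING".
            "level": record.levelname.lower(),
            "message": record.getMessage(),
        })


# Capture the log output in memory instead of writing to stderr.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(ReadableLevelFormatter())

logger = logging.getLogger("readable_levels_demo")  # illustrative name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.warning("Disk usage at %d%%", 91)
line = stream.getvalue().strip()
print(line)
```

The resulting entry carries `"level": "warning"`, which anyone on the list above can read without a lookup table.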
I am sure there are many "intellectual" arguments as to why we should use numeric log levels. For example, storing numbers takes up less space than storing strings. Or, using numbers allows us to use greater-than / less-than filtering on logs. Or, using numbers allows for more efficient indices in the log repository.
But, I view logs as a consumable "product". And, like any product, it's the user experience (UX) that should drive the feature-set. That means, when possible, thinking about the user and the use-cases first and the implementation second. So, from my point of view, that means using empathetic, readable log-level strings instead of normal-form numeric values.
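For what it's worth, readable strings don't necessarily rule out threshold-style filtering. The following is a minimal sketch, assuming the ordering is kept in a small lookup table on the consuming side rather than in the stored value. The level names, their order, and the `at_least` helper are all assumptions made for illustration, not part of any particular logging tool.

```python
# Hypothetical severity order, lowest to highest. These names are assumptions.
LEVEL_ORDER = ["trace", "debug", "info", "warning", "error", "fatal"]
LEVEL_RANK = {name: rank for rank, name in enumerate(LEVEL_ORDER)}


def at_least(entries, minimum):
    """Return the entries whose string level is at or above `minimum`."""
    threshold = LEVEL_RANK[minimum]
    return [entry for entry in entries if LEVEL_RANK[entry["level"]] >= threshold]


# Illustrative log entries; the stored level stays human-readable.
entries = [
    {"level": "info", "message": "User logged in"},
    {"level": "warning", "message": "Retrying payment request"},
    {"level": "error", "message": "Payment request failed"},
]

for entry in at_least(entries, "warning"):
    print(entry["level"], "-", entry["message"])
```

The comparison still happens on numbers, but only inside the tooling; the person reading the log entry only ever sees the word.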