What is NewRelic, and why operations and developers should all be looking at it

Published in CAMS Engineering · 4 min read · Aug 19, 2018

To set the tone right at the beginning: I do not usually advocate for commercial products. I believe there should always be more than one tool available for a technical task, and that the choice of tool reflects the “personality” of the individual or team working on the task (more about this in the future). Yet I started this post with excitement, not to persuade you to choose NewRelic and abandon the other options out there, but because of our initiative to advance to the next level of DevOps culture: democratising operations across the entire team.

So what is NewRelic? I like to think of it as a SaaS product purpose-built for modern technical operations, troubleshooting and data analytics. The NewRelic platform consists of a suite of products that serve different purposes but are linked together so that all of their data can be correlated. The products we currently use include:

Application Performance Monitoring (APM)
Contains the data and metrics of your application.
Browser
For web apps only; contains data collected from your end users' browsers.
Mobile
Contains data from native mobile apps, such as iOS and Android apps.
Infrastructure
For servers and infrastructure platforms (such as Kubernetes).
Synthetics
Contains scripted tests, ranging from simple health checks, to verifying API responses, to tests that run in a browser.

It is a helpful tool that comes with a price. At carsguide we are fortunate to be able to use most products in the platform without worrying about breaking the bank. This is one of the reasons NewRelic is an important part of our toolkit, and it has really helped us in many ways.

Data provided by NewRelic is multi-dimensional. NewRelic provides APM monitoring agents that hook into the runtime or framework your application is running on, such as PHP, node.js and Java, and collect rich, detailed data about how your application runs without needing any change to your application code. If you are willing to do a little extra work, it also provides SDKs, so you can control certain agent behaviour and enrich the data collected in APM from code. With minimal configuration, your application’s performance, its dependent services such as databases, and its throughput and error data will start to show up in the web console.
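As a rough illustration of what enriching APM data from code can look like, the sketch below uses the Python agent's `add_custom_attribute` call, assuming a recent agent version. Python is not one of the runtimes named above and is used here only for brevity. The helper name `tag_current_transaction` and the attribute names are hypothetical, and the import is guarded so the snippet still runs where the agent is not installed.

```python
try:
    import newrelic.agent as nr_agent  # New Relic Python agent, if installed
except ImportError:
    nr_agent = None  # fall back to a no-op so the sketch runs anywhere


def tag_current_transaction(attributes):
    """Attach custom attributes to the current APM transaction.

    Returns the list of attribute names that were handed to the agent
    (empty when the agent is not installed).
    """
    sent = []
    for key, value in attributes.items():
        if nr_agent is not None:
            # Has no effect when called outside a monitored transaction.
            nr_agent.add_custom_attribute(key, value)
            sent.append(key)
    return sent


# Hypothetical attribute names, for illustration only.
tag_current_transaction({"editorial_user": True, "article_id": 12345})
```

Attributes attached this way travel with the transaction data, so they can be used later to filter or group what the agent already collects.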

Application behaviour change detected by NewRelic

With data that taps into the fine detail of application behaviour, NewRelic is interesting not only for operations but for developers too. In late May this year, the development team working on our publishing system deployed a change that generated a ton of excessive SQL updates to the database. The issue resided in a part of the system that web traffic from the general public does not reach, so public traffic triggered no symptoms, and the production system is powerful enough that overall performance held up without a degradation significant enough to trigger an alarm. But we did observe anomalies in our database metrics.

It took the operations team several weeks to establish the trend. NewRelic provided the final and definitive piece of data that proved the issue existed: the throughput of SQL updates would skyrocket to more than 6000% of the ordinary value on business days, but not on weekends. This matched the times when our editorial team uses backend functions that are not available to the public. The problematic behaviour repeated every week after that deployment.
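The weekday-versus-weekend pattern described above can be sketched as a simple check that expresses each day's throughput as a percentage of a baseline and flags the days that exceed a threshold. This is only an illustration of the arithmetic, not how NewRelic computes anything; the function names, the baseline and the sample numbers are all made up and are not measurements from the incident.

```python
from datetime import date


def percent_of_baseline(value, baseline):
    """Express value as a percentage of the baseline value."""
    return value / baseline * 100.0


def flag_spikes(daily_throughput, baseline, threshold_pct=6000.0):
    """Return (day, is_weekday, percent) for days above threshold_pct."""
    spikes = []
    for day, value in sorted(daily_throughput.items()):
        pct = percent_of_baseline(value, baseline)
        if pct > threshold_pct:
            # weekday() is 0 for Monday through 6 for Sunday
            spikes.append((day, day.weekday() < 5, pct))
    return spikes


# Made-up numbers for illustration; not measurements from the incident.
sample = {
    date(2018, 6, 1): 6500.0,  # Friday
    date(2018, 6, 2): 110.0,   # Saturday
    date(2018, 6, 3): 95.0,    # Sunday
    date(2018, 6, 4): 7200.0,  # Monday
}
spikes = flag_spikes(sample, baseline=100.0)
```

With this sample, only the Friday and the Monday are flagged, which is the kind of business-day pattern that pointed us towards the editorial backend.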

Developers will be interested to see the error stacktrace captured by NewRelic

Right after we presented the findings to the development team, they were able to quickly identify the issue and produce a fix. Using the historical metric timeline provided in NewRelic, they narrowed down the offending changeset among past deployments (multiple deployments had been completed after the problematic change was introduced). And with the drill-down data for the offending query class, they were able to pinpoint the exact part of the code that triggered the issue.
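Narrowing down the changeset from a metric timeline boils down to finding the latest deployment that happened at or before the moment the anomaly first appeared. A minimal sketch of that lookup follows; the function name `suspect_deployment` and every timestamp are hypothetical, and in practice the onset time would be read off the timeline rather than hard-coded.

```python
from bisect import bisect_right
from datetime import datetime


def suspect_deployment(deploy_times, onset_time):
    """Return the latest deployment at or before the anomaly onset, or None."""
    ordered = sorted(deploy_times)
    idx = bisect_right(ordered, onset_time)
    return ordered[idx - 1] if idx else None


# Hypothetical deployment history and onset time, for illustration only.
deployments = [
    datetime(2018, 5, 20, 10, 0),
    datetime(2018, 5, 29, 15, 30),
    datetime(2018, 6, 5, 11, 0),
    datetime(2018, 6, 12, 9, 45),
]
onset = datetime(2018, 5, 30, 9, 0)
suspect = suspect_deployment(deployments, onset)
```

Here the lookup picks the second deployment even though two more followed it, which mirrors the situation where several releases had already shipped on top of the problematic one.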

But this is not the end of the story. One lesson from this event is that operations should not be the only group watching metrics. This idea is usually not very popular among developers: who would want to stare at a bunch of charts all the time with nothing in particular to look for? But with the power of NewRelic, this could become something interesting to them:

What is the performance impact of my change?
How will my function design react under load?
What are the potential improvements I can make to the system?

At carsguide we know DevOps is not just CI/CD. It is a mindset that lives in every member of the technology team. And it is the reason we believe operations and developers should all be interested in, and looking at, the metrics of production systems.

Written by Duran S