Using Metrics to Optimize J2EE Application Performance in Production
The payoff is substantial
By: Brad Micklea; Geoff Vona
Aug. 10, 2005 09:00 AM
Despite the increasingly widespread adoption of J2EE for enterprise applications, measuring their performance in production continues to be a black art. Without knowing what to look for, many people measure anything that seems useful, which soon results in an overloaded system and reams of meaningless metrics data. It's tempting to just throw up your hands and start making system changes based mainly on hunches.
Basic Metrics
Throughput, the number of transactions executed by the system over a period of time, is a good indication of the system's ability to handle load.
Resource Utilization, or how heavily a particular system element is being used, is the easiest metric to understand. It is not necessarily the most useful for determining system performance, however, because only the utilization of contended resources has a significant impact on performance.
We have found it more useful to define a new metric called Service Demand, calculated by the following formula:
Service Demand = Utilization of the resource / Throughput
Service Demand looks at resources in terms of the demands being put on them. This gives us a clear idea of how heavily a resource is used as the users' demand for it increases.
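The Service Demand formula above can be sketched in a few lines of Java. This is only an illustration: the class name `ServiceDemand`, the method name `of`, and the choice of units (utilization as a fraction between 0 and 1, throughput in transactions per second) are assumptions, not something the article prescribes.

```java
/** Minimal sketch of the Service Demand formula; names and units are assumptions. */
final class ServiceDemand {
    private ServiceDemand() {}

    /**
     * @param utilization fraction of time the resource is busy, 0.0 to 1.0 (assumed unit)
     * @param throughput  completed transactions per second (assumed unit)
     * @return resource time consumed per transaction, in seconds
     */
    static double of(double utilization, double throughput) {
        if (utilization < 0.0 || utilization > 1.0) {
            throw new IllegalArgumentException("utilization must be between 0 and 1");
        }
        if (throughput <= 0.0) {
            throw new IllegalArgumentException("throughput must be positive");
        }
        // Service Demand = Utilization of the resource / Throughput
        return utilization / throughput;
    }
}
```

With these assumed units, a CPU that is 60% busy while the system completes 100 transactions per second gives `ServiceDemand.of(0.60, 100.0)`, or roughly 0.006 seconds of CPU time per transaction.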
Interrelating Metrics
The first thing to notice is that throughput and response time are often at odds. For an interactive application in production, we typically want to maximize throughput, as long as response time is at or below some threshold. For example, we may find we can achieve a maximum throughput of 100 transactions per second while keeping response time at or below two seconds.
Figure 1 also shows that resource utilization typically controls system behavior. It is resource contention that causes the dramatic drop in throughput and the commensurate rise in response time that marks the Buckle Zone. The effect of the Buckle Zone is a severe drop in application performance, because the system spends most of its time managing resource contention rather than servicing requests.
It's important to see how your application behaves in each of these three zones, and to identify the specific metric values that cause the system to shift from light load to high load to the Buckle Zone. This will be particularly useful for setting alerts in your production monitoring tool.
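The trade-off between throughput and a response-time threshold can be sketched as a search over measured load samples. The sketch below is hypothetical: `LoadSample` and `ThroughputCeiling` are illustrative names, and it assumes you already have paired throughput and response-time measurements from a load test or a production monitor.

```java
import java.util.List;
import java.util.OptionalDouble;

/** One observation: throughput (transactions/sec) with the response time seen at that load. */
final class LoadSample {
    final double throughputTps;
    final double responseTimeSec;

    LoadSample(double throughputTps, double responseTimeSec) {
        this.throughputTps = throughputTps;
        this.responseTimeSec = responseTimeSec;
    }
}

final class ThroughputCeiling {
    private ThroughputCeiling() {}

    /**
     * Returns the highest observed throughput whose response time stayed
     * at or below the threshold, or empty if no sample qualifies.
     */
    static OptionalDouble maxThroughputWithin(List<LoadSample> samples,
                                              double maxResponseTimeSec) {
        return samples.stream()
                .filter(s -> s.responseTimeSec <= maxResponseTimeSec)
                .mapToDouble(s -> s.throughputTps)
                .max();
    }
}
```

A sample in which throughput has fallen while response time has climbed well past the threshold is simply filtered out, which is how a buckled system would appear in this data. The value returned is a natural candidate for an alert threshold.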
A Model for J2EE Systems
Figure 2 also shows how the metrics discussed earlier can be measured at each level. We'll need to measure and understand response time dispersion at the client-handling level, at the application code level, and for services like JDBC. Throughput is most important at the client-handling level, where we have to understand how much user load our system can handle. Resource utilization is measured at many points in the system (OS, execute threads, services, etc.) so that we can correlate the information and see how different elements in the system are affecting each other.
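As a rough illustration of measuring response time dispersion at one level, the sketch below summarizes a set of response-time samples by their mean and standard deviation. Treating dispersion as the population standard deviation is an assumption made for this example, and the class name `ResponseTimeStats` is hypothetical; the same calculation could be applied separately to samples gathered at the client-handling level, in application code, or around JDBC calls.

```java
/** Sketch: summarize response-time samples (in seconds) by mean and spread. */
final class ResponseTimeStats {
    private ResponseTimeStats() {}

    /** Arithmetic mean of the samples. */
    static double mean(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    /** Population standard deviation, used here as the dispersion measure (an assumption). */
    static double standardDeviation(double[] samples) {
        double m = mean(samples);
        double squaredDeviations = 0.0;
        for (double s : samples) {
            squaredDeviations += (s - m) * (s - m);
        }
        return Math.sqrt(squaredDeviations / samples.length);
    }
}
```

Two levels with the same mean response time can have very different dispersion, so it helps to compare both numbers level by level.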
Points of Measurement and Overhead
There are two methods for measuring client response time: browser scripting and injection of synthetic transactions. Browser scripting is usually implemented through JavaScript in the HTML pages returned to users. While this is the best measure of the user's experience, it presents several significant difficulties: it's very hard to measure all the clients reliably, and deployment and maintenance of the scripting code for those pages can become difficult and tedious.
Synthetic transactions address most of these shortcomings and have become more commonly used. The idea is to inject synthetic or scripted user transactions into the system with some tool. These transactions can be easily measured and give a good approximation of what the users are experiencing. It's important to realize their limitations. Unless the injector is placed near the end users, rather than just outside or inside the firewall, it cannot provide data on the wider network effects. Creating realistic synthetic transactions also requires a fairly detailed knowledge of user/site interaction and the patience to model this interaction accurately. However, the control that synthetic transactions provide outweighs these limitations.
Operating system (OS) metrics, familiar to most developers, will be gathered from machines throughout the J2EE system. Seeing the shifting patterns of CPU, memory, and disk usage on the various tiers of the system greatly aids understanding. But to build that data into a useful J2EE system model, you have to be able to accurately associate system metric information with application container and application code data at specific intervals in time. Only by doing this can you draw a picture of the complex interactions between the application, its application server container, and the underlying system.
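One simple way to associate OS metrics with application data at specific intervals is to bucket both series into fixed time windows and join them on the window start. The sketch below is a minimal illustration under stated assumptions: `IntervalCorrelator` and its methods are hypothetical names, timestamps are non-negative epoch milliseconds, and each series arrives as parallel arrays of timestamps and values.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sketch: align two metric series on fixed time intervals so they can be compared. */
final class IntervalCorrelator {
    private IntervalCorrelator() {}

    /** Start of the interval containing the timestamp (assumes a non-negative timestamp). */
    static long bucketStart(long timestampMillis, long intervalMillis) {
        return (timestampMillis / intervalMillis) * intervalMillis;
    }

    /** Averages the values that fall in each interval; keys are interval start times. */
    static Map<Long, Double> averageByInterval(long[] timestamps, double[] values,
                                               long intervalMillis) {
        if (timestamps.length != values.length || intervalMillis <= 0) {
            throw new IllegalArgumentException("mismatched series or non-positive interval");
        }
        Map<Long, double[]> sums = new TreeMap<>(); // value holds {sum, count}
        for (int i = 0; i < timestamps.length; i++) {
            long key = bucketStart(timestamps[i], intervalMillis);
            double[] acc = sums.computeIfAbsent(key, k -> new double[2]);
            acc[0] += values[i];
            acc[1] += 1.0;
        }
        Map<Long, Double> averages = new TreeMap<>();
        for (Map.Entry<Long, double[]> e : sums.entrySet()) {
            averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    /** Pairs the two series on intervals present in both; each value is {a, b}. */
    static Map<Long, double[]> join(Map<Long, Double> a, Map<Long, Double> b) {
        Map<Long, double[]> joined = new TreeMap<>();
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                joined.put(e.getKey(), new double[] {e.getValue(), other});
            }
        }
        return joined;
    }
}
```

For example, CPU utilization samples from the OS and response-time samples from the application server could each be averaged into 10-second intervals and then joined, giving one row per interval that shows both values side by side. Intervals that appear in only one series are dropped by `join`, which is a design choice for this sketch rather than a requirement.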