Friday, November 28, 2008

Linking Business to IT

You may have heard recently about the new paradigm of managing your IT by linking it to your Business. The first question that arises on hearing this is: what do you mean? My IT infrastructure is part of my business; it is integral to it. What does it mean to link my IT to my Business?

Business Transaction Management – The Link

What is your IT essentially? It is an infrastructure of hardware components that interact in order to provide a service or application to your users. Those users are either paying customers or employees who in turn provide services or products to paying customers. Those users are the business; they are the source of revenue that enables the business to thrive or dwindle.

Business Transaction Management solutions provide the link between all of those processes and interactions that run throughout your IT and the business' source of income, the individual transaction activations.

How does Business Transaction Management do this? The concept is simple, but the means have been sought after for many years (and have finally been realized). Every single piece of data that enters or leaves every single server, along with the resource consumption of every process in every server, is linked back to a transaction activation. In this manner, not only can you connect the different "dots" within your data center, but you can also connect to the most important dot of them all: the User.
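As a minimal sketch of this linking idea (not any vendor's implementation), assume that every event recorded on a server carries a hypothetical `txn_id` field naming the transaction activation that caused it. Grouping events by that field gives, for each activation, the path it took through the data center and the resources it consumed. The field names and sample values below are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical events recorded on individual servers; field names are assumptions.
events = [
    {"txn_id": "T1", "server": "web01", "cpu_ms": 12},
    {"txn_id": "T2", "server": "web01", "cpu_ms": 9},
    {"txn_id": "T1", "server": "app03", "cpu_ms": 40},
    {"txn_id": "T1", "server": "db02", "cpu_ms": 95},
    {"txn_id": "T2", "server": "app01", "cpu_ms": 31},
]

def link_to_activations(events):
    """Group per-server events by the transaction activation that caused them."""
    linked = defaultdict(lambda: {"path": [], "cpu_ms": 0})
    for event in events:
        activation = linked[event["txn_id"]]
        activation["path"].append(event["server"])
        activation["cpu_ms"] += event["cpu_ms"]
    return dict(linked)

summary = link_to_activations(events)
for txn_id, info in summary.items():
    print(txn_id, " -> ".join(info["path"]), f"({info['cpu_ms']} ms CPU)")
```

With the events tied to an activation, a slow response reported by one user can be traced to the specific servers that handled that user's transaction.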

Any problem that arises within the system will always be identified by the user (unless it is a false alarm). With the power to "connect the dots" and see every parameter and statement within its business context (the transaction activation), resolution of problems becomes child's play.

The following video gives a clear view of the importance of Business Transaction Management and its ability to link Business and IT:


Wednesday, November 12, 2008

The BTM Business Case for Retailers

Convincing retailers to invest in Business Transaction Management is a big challenge.

  • The margins in retail are rather slim
  • The impact of a single transaction is low (compared to the Finance Industry for instance)
  • Competition is fierce due to today's retail search engine proliferation
  • Expanding into online sales has been enough of a challenge as it is
  • ROI has to be clearly shown in order to justify the investment

I have taken the liberty of putting together a business case for why retailers should consider a BTM solution; I truly believe that those that take the step will find it hard to understand how they ever lived without Business Transaction Management. How can you run a business that is so transaction dependent and not be able to make the connection between business and IT?

Figures for this case study have been inspired by Aberdeen Research's well-cited APM study from June 2008 (available for free). I had a tough time finding any analysts or vendors that had published concrete numbers on the retail industry. If you have seen anything, please let me know.

ROI case for Business Transaction Management

A large retail company with annual revenue of $563M, from both online sales (30%) and its nationwide chain of stores, was unsatisfied with the performance of its business critical applications. The company was unable to identify issues before they impacted end users, its applications were increasing in complexity, and it was unable to test the performance of applications whose development had recently been moved offshore.

Growth in their online division had forced them to rapidly increase capacity over the past few years, and they were now stuck with a very heterogeneous infrastructure, ranging from the old legacy systems that they had implemented in the early nineties to their newer SOA based applications. Things were such a mess that they estimated IT employees were spending 30% of their time just searching for the cause of problems, let alone resolving them. The Retailer had also estimated a 5-7% revenue loss, due not only to their customer facing web applications but also to productivity issues with their CRM and ERP applications. They also calculated the total cost of downtime to be $80,000 per hour.
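To put those estimates in dollar terms, here is a small back-of-the-envelope calculation. It assumes the 5-7% loss applies to total annual revenue, which the case does not state explicitly; the "equivalent downtime hours" figure is only an illustration of scale, not a number from the case.

```python
# Figures taken from the hypothetical case above (USD).
annual_revenue = 563_000_000
online_share_pct = 30             # share of revenue from online sales
loss_low_pct, loss_high_pct = 5, 7
downtime_cost_per_hour = 80_000

online_revenue = annual_revenue * online_share_pct // 100

# Assumption: the estimated loss is a percentage of total annual revenue.
lost_low = annual_revenue * loss_low_pct // 100
lost_high = annual_revenue * loss_high_pct // 100

# Illustration only: hours of downtime that would cost as much as the low estimate.
equivalent_downtime_hours = lost_low / downtime_cost_per_hour

print(f"Online revenue:         ${online_revenue:,}")
print(f"Estimated revenue loss: ${lost_low:,} to ${lost_high:,} per year")
print(f"Low estimate equals {equivalent_downtime_hours:.1f} hours of downtime")
```

Under that assumption the Retailer is losing roughly $28M to $39M a year, which is the yardstick any BTM investment would be measured against.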

In order to increase productivity, customer satisfaction and brand image, and to reduce the lost revenue opportunities (in online sales), the Retailer had come to the conclusion that the monitoring tools that they had implemented for each one of their components were not enough, and that it was time to invest in a business transaction management tool. They were looking for a tool that would give them an understanding of the business context of their problems. They needed a single solution that would work for both production and pre-production testing, so that developers could gain visibility into the interactions between their different applications. Their current tools were retrieving large amounts of data at every node, but the information was useless when it came to increasing productivity and sales; they needed something that would provide visibility across their entire infrastructure. Most importantly, they didn't have weeks or months to waste on implementing one of the solutions provided by the larger vendors, and didn't want professional services people modifying their code. They wanted a solution that was not service intensive and wouldn't cost them an arm and a leg, given the tight margins that the retail industry was facing as a result of a tightening economy and rising fuel costs.

For all of the reasons stated above, the Retailer should go with a Business Transaction Management Solution. In order to justify this claim, the following presents a hypothetical continuation to this case.

The organization had requested an initial BTM POC for their online division in order to better understand the potential ROI from increased sales revenue, which is easier to quantify than increased employee productivity and reduced mean time to repair, both of which are tougher to justify.

On the first day of the POC, the BTM agents were already in place and transactions were being monitored from the user end all the way through the web servers and app servers to the legacy back end. The value became clear immediately: during peak hours, numerous transactions had been taking much too long. The retailer had always wondered why it was experiencing a high number of abandoned shopping carts online. One guess was that users who were shopping via the popular search engines ended up opting for competitors that were more responsive to customers. Not only that, various members of the IT department were able to connect the dots on many of the performance problems that they were experiencing and resolve them immediately.

A full installation was put into place, and suddenly issues with application performance could be pinpointed immediately without the need to hold long, grueling meetings between the IT department's professionals. In the first few months of using their BTM solution, a steady decrease in abandoned shopping carts could be charted. They were able to achieve a 600% average improvement in response times on business critical transactions. Developers were very satisfied to have a solution that they could log on to directly, test their changes and gain immediate insight into how the overall application performance had been affected. The MTTR was decreased, and problems could be spotted proactively, before they reached the customer, with an 85% improved success rate.

Since then, the retailer has expanded the scope of their BTM solution to include its internal applications. They were able to raise employee productivity, since they saw an average improvement in application availability of 80%, and best of all, IT headcount did not have to be increased. The Retailer has also been able to increase capacity exactly where it needs it; the justification for increased capacity becomes easy when IT problems are put in the business context.

If you can think of any way to make this study more robust, please write!

Saturday, November 8, 2008

Transaction Monitoring – Network Appliances

Yet another way to implement Transaction Monitoring solutions is via a Network Appliance. This approach is defined here as any approach that collects data by non-intrusive "Network Sniffing". Two good examples of vendors that provide this type of solution are B-Hive and Correlix.

How it Works

Network appliance solutions usually connect to a port mirror in order to collect the traffic, and then try to reconstruct the entire transaction. Information needs to be collected directly from every node that is of interest.
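To illustrate why reconstruction from sniffed traffic is a matter of inference, here is a minimal sketch of one simple timing heuristic: a downstream call is attributed to any upstream request whose time window encloses it. This is an assumption made for illustration; actual appliances use their own, typically proprietary, correlation methods. The observation tuples are hypothetical.

```python
# Hypothetical observations from a port mirror: (tier, start_s, end_s).
# No transaction ID is visible on the wire; only timing is known.
web_requests = [("web", 0.00, 0.50), ("web", 0.10, 0.90)]
db_calls = [("db", 0.05, 0.08), ("db", 0.20, 0.40)]

def candidate_parents(call, requests):
    """Return indexes of upstream requests whose time window encloses the call."""
    _, start, end = call
    return [
        index
        for index, (_, req_start, req_end) in enumerate(requests)
        if req_start <= start and end <= req_end
    ]

matches = [candidate_parents(call, web_requests) for call in db_calls]
for call, parents in zip(db_calls, matches):
    status = "unambiguous" if len(parents) == 1 else "ambiguous"
    print(call, "-> candidate web requests", parents, status)
```

The first database call fits only one web request, but the second fits both concurrent requests, so timing alone cannot say which transaction it belongs to.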

Applications

  • Any application where transaction latencies need to be monitored in a production environment
  • Managing SLAs
  • Systems which cannot be tampered with at all and need a "plug and play" solution

Advantages

  • Zero Overhead
  • Full time monitoring
  • Immediate installation
  • Installation only concerns the network administrator
  • There is no risk of crashing the system (some "Deep Dive" solutions will cause system failure if they are used to monitor too many transactions, due to high overhead)

Drawbacks

  • Uses an algorithmic approach to track transactions, which limits the accuracy of metrics: latencies are not exact, and you do not know the precise flow of the transaction through the different tiers
  • Tracking is not really end-to-end since you cannot see what is actually happening within the servers (cannot achieve full visibility)
  • Even if you collect data from all nodes, correlating that data into a single transaction path (or topology of the entire transaction) accurately has yet to be done (if you can give a concrete example, then let me know and I will post it)
  • Receiving data at the network level makes measuring encrypted data close to impossible
  • Once an event has begun processing, it cannot be controlled (say for resource allocation purposes)

Disclaimer

When trying to give an understanding of a general approach to a solution, all potential advantages and drawbacks (which people who develop or promote the specific solution would prefer to ignore) are listed. Comment with any objections (as people have done in the past) and I will at some point post everything.

Wednesday, November 5, 2008

Business Transaction Tracing – BTM’s Unused Synonym

Let us start off by thanking all of those who contributed their very informative comments on the last post - Transaction Management and "Deep Dive" Java/.NET Profilers. It is great to have feedback from CA Wily, Dynatrace and Jinspired. The objective of this blog is to raise awareness about Business Transaction Management and to set any myths or rumors right, so keep those comments coming.

Gartner's Definition of Business Transaction Tracing

One definition that was left out of the Business Transaction Management Definition post was Gartner's.

In their white paper titled "The Four Dimensions of Application Performance Monitoring", Gartner labels "Business Transaction Flow Tracing" as one of the four functionalities that "have emerged to circumvent some of the APM difficulties associated with modular, distributed, interdependent and context-sensitive applications".

Will Cappelli leads off the definition by stating that when a problem with the availability of an application pops up, monitoring component-level health is of limited help in determining the root cause: "Used in conjunction with an application dependency map, a report showing a cluster of component latency degradations could be used to guess at the source of the performance issue. More often than not, an insufficient number of components are instrumented and/or the topology plus performance degradation is too ambiguous to be helpful."

Mr. Cappelli then continues to state that Business Transaction Tracing fills the Application Performance Management void that simply monitoring component-level health leaves by following these steps:

  • "First, members of the operations or application support team would be required to instrument path-critical components in the stack and infrastructure, supporting the application being monitored with what amount to sensors."
  • "Second, they must define, package and mark a sequence of interactions at an application's interface — defined as a "business transaction." An instance is executed and the mark is passed through the application's components as it is exercised and sensed, and progress of its path is reported on in real time or near real time. This makes it possible to trace a performance problem's root cause, particularly when used in conjunction with health statistics gathered by the third type of APM functionality."
  • "Finally, it would, once again, be prohibitive to place sensors on more than a few components. Thus, having a good application dependency map is critical to the effective deployment of this type of APM functionality."
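The steps quoted above can be sketched in a few lines. This is only an illustration of the general idea of passing a mark through instrumented components, not Gartner's or any vendor's design; the component names, the `sensor` decorator and the `trace_log` list are all hypothetical.

```python
import functools
import time
import uuid

trace_log = []  # sensor reports: (mark, component, elapsed_seconds)

def sensor(component):
    """Decorator acting as a 'sensor' that reports each marked call passing through."""
    def wrap(func):
        @functools.wraps(func)
        def instrumented(mark, *args, **kwargs):
            started = time.perf_counter()
            try:
                return func(mark, *args, **kwargs)
            finally:
                elapsed = time.perf_counter() - started
                trace_log.append((mark, component, elapsed))
        return instrumented
    return wrap

@sensor("database")
def query_inventory(mark, item):
    return {"item": item, "in_stock": True}

@sensor("app-server")
def check_out(mark, item):
    return query_inventory(mark, item)  # the mark travels with the call

def business_transaction(item):
    """Mark one instance of the business transaction and execute it."""
    mark = uuid.uuid4().hex
    return mark, check_out(mark, item)

mark, result = business_transaction("book")
# Components report when they finish, so the innermost component appears first.
path = [component for m, component, _ in trace_log if m == mark]
print(path)
```

Only the two decorated functions act as sensors here, which mirrors the point in the last quote: with just a few components instrumented, a dependency map is needed to fill in the rest of the path.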

Once again, this is not a clear cut definition of Business Transaction Management, but another thing to think about when attempting to provide a definition.

What is Business Transaction Management?

Please help define Business Transaction Management – post your comment!

Saturday, November 1, 2008

Transaction Management and “Deep Dive” Java/.NET Profilers

When an Enterprise finally gets the wakeup call that its applications are performing under par, it starts looking into Transaction Monitoring (or Transaction Tracking/Tracing) solutions. One type of solution is the "Deep Dive" Java/.NET solution, defined here as a solution that uses Bytecode Instrumentation (or Java/.NET Hooks) in order to collect thorough code level metrics for J2EE/.NET experts. These Application Performance Management solutions are used throughout the entire lifecycle of the product; they are a strong tool for the developer, but a very weak tool for the production environment, since they are unable to monitor all of the transactions on all tiers all the time due to very high overhead.

Who Offers These Solutions?

These solutions tend to be offered by the larger corporations:

  • CA Wily – Introscope
  • HP – TransactionVision
  • BMC – Application Problem Resolution (Identify)
  • Dynatrace – PurePath
  • Precise – APM

Overview

These tools provide deep diagnostics into Java/.NET applications, down to the code level. They are used by J2EE/.NET experts in order to locate problems before deployment. These solutions are too low level for use by most operations teams and system administrators, as they extract a glut of data and do not enable a high level view of the system. On the other hand, application teams rely on them for development, and they can be used in production to a certain extent in order to monitor synthetic transactions or a small percentage of the real transactions that are flowing through the system.

How they Work

Bytecode Instrumentation (or Java hooks) retrieves data from the nodes that are running J2EE/.NET applications. This is done by utilizing the class loading mechanism of the runtime (the JVM for J2EE or the CLR for .NET) in order to intercept specific classes or method calls within the application.
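Real products rewrite JVM bytecode or CLR intermediate code as classes are loaded. The sketch below is only a loose Python analogy of that idea, not actual bytecode instrumentation: a class decorator wraps every public method when the class is defined, so each call is timed without touching the method bodies. The names `instrument`, `call_metrics` and `OrderService` are hypothetical.

```python
import functools
import time

call_metrics = []  # (class name, method name, elapsed seconds)

def _timed(class_name, method_name, method):
    """Return a wrapper that times each call to `method` and records it."""
    @functools.wraps(method)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        try:
            return method(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - started
            call_metrics.append((class_name, method_name, elapsed))
    return wrapper

def instrument(cls):
    """Wrap every public method of `cls`; runs once, when the class is defined."""
    for name, attr in list(vars(cls).items()):
        if callable(attr) and not name.startswith("_"):
            setattr(cls, name, _timed(cls.__name__, name, attr))
    return cls

@instrument  # loosely analogous to a hook applied at class load time
class OrderService:
    def place_order(self, item):
        return f"ordered {item}"

confirmation = OrderService().place_order("book")
recorded = [(cls_name, method) for cls_name, method, _ in call_metrics]
print(confirmation, recorded)
```

Every wrapped call pays for the extra timing and bookkeeping, which hints at why instrumenting all methods for all transactions becomes expensive in production.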

Applications

  • Gives J2EE/.NET experts insight into where the problems are
  • Used mainly in the development phase and pre-deployment
  • Can be used in production for a few percent of the transactions

Advantages

  • Gives developers deep insight into problems at the source code transaction data level
  • With the help of synthetic transactions, deep diagnostics can be performed during production
  • You can get a full method call, similar to a debugger

Drawbacks

  • Lengthy Implementation
  • Only works with certain environments
  • Cannot trace all transactions in real time
  • Not recommended for the production environment
  • Difficult for IT support staff to use
  • It only helps with Java or .NET
  • The solutions are not designed for a high level production view; they do not provide an extensive topology of the system
  • Lots of detailed data is collected; application owners and system administrators do not always know what to do with all of the information

Business Transaction Management

Although the list of drawbacks is long, the objective of this article is not to bash this kind of solution (really, they will do wonders for your application development team). It is simply to help you understand that if you are looking for a solution that will cover all of your bases during production, these kinds of solutions won't cut it (they provide up to 10% sampling for limited periods of time). These solutions cannot monitor the entire topology of each transaction, even though they claim to be end to end. These traits are by design, and no amount of marketing hype will enable these products to solve all of your problems as they claim to do.
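To make the sampling limitation concrete, here is a small sketch of what tracing a 10% sample means. The hash-based selection is an assumption chosen for illustration; the products named in this post may pick their samples differently, and the transaction IDs are made up.

```python
import hashlib

SAMPLE_PERCENT = 10  # the sampling ceiling mentioned above

def is_sampled(txn_id, percent=SAMPLE_PERCENT):
    """Deterministically select roughly `percent`% of transactions by hashing the ID."""
    digest = hashlib.sha256(txn_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent

txn_ids = [f"txn-{n}" for n in range(10_000)]  # hypothetical transaction IDs
sampled = [txn_id for txn_id in txn_ids if is_sampled(txn_id)]
untraced = len(txn_ids) - len(sampled)

print(f"traced {len(sampled)} of {len(txn_ids)} transactions; {untraced} left untraced")
```

If the one transaction a customer complains about falls in the untraced majority, the deep diagnostics simply have nothing to show for it.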