Wednesday, July 15, 2009

IT Reliability through Business Transaction Management


...Continued from the last post
Enabling Reliable IT – Managing Performance
How do you know when an end user is experiencing bad response times?
  • They call the help desk to complain – usually only after several earlier incidents in which they were unhappy with the application’s reliability.
  • An end user monitoring tool measures bad response times
End User Measurements
There are a great number of tools on the market today that perform this task in a variety of ways. Below is a summary of the different approaches.
Software Based Real User Measurements
Desktop Agent
  • Strength – enables the monitoring of the end user’s desktop and can measure response times for fat client based applications
  • Weakness – must be installed at each desktop
Javascript Injection
  • Strength – no need for end user installation
  • Weakness – Javascript needs to be added to web application code
Browser Plug-In
  • Strength – easy installation without code modification
  • Weakness – still requires end user installation
Network Appliance Based Real User Measurements
All of these solutions utilize a network sniffer installed at a port mirror in order to estimate end user response times. The advantage of this approach is the ease of installation; the disadvantages are the cost of placing probes at every point where the network is accessed and the limited accuracy of the data.
Synthetic End User Measurements Performed by “Robots”
The classic availability monitors use this approach: scripts are used to “ping” the system and check its availability. The advantage is that availability can be monitored overnight and before the morning workload; the disadvantage is that real user response times are not being measured and the scripts have to be modified with every change to the application.
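As a rough illustration of the "robot" approach described above, here is a minimal Python sketch. The names run_probe, ProbeResult and availability are made up for this example, and the check itself is passed in as a function so that any scripted step (an HTTP request, a login sequence) could be plugged in. It is a sketch under those assumptions, not how any particular monitoring product works.

```python
import time
from typing import Callable, List, NamedTuple


class ProbeResult(NamedTuple):
    available: bool
    response_seconds: float


def run_probe(check: Callable[[], bool],
              clock: Callable[[], float] = time.monotonic) -> ProbeResult:
    """Run one scripted check and time it; any exception counts as unavailable."""
    start = clock()
    try:
        ok = bool(check())
    except Exception:
        ok = False
    return ProbeResult(ok, clock() - start)


def availability(results: List[ProbeResult]) -> float:
    """Fraction of probes that succeeded (0.0 for an empty list)."""
    if not results:
        return 0.0
    return sum(r.available for r in results) / len(results)
```

Note that the timing here is the robot's own response time, not a real user's, which is exactly the limitation mentioned above.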
Finding the Location of the Problem
Now that you know there is a problem, because the end user is experiencing reliability issues with an application, the next step is to narrow down its location. Research has shown that 80% of the time spent troubleshooting performance problems goes to finding where the problem actually is.

In the picture below “John the User” is experiencing poor response times – the IT department is tasked with resolving this issue.
To be continued next week...

Sunday, July 5, 2009

IT Reliability through Business Transaction Management

I just got back from a regional CMG conference where the majority of attendees identify themselves as IT performance and capacity professionals. Ultimately the objective of their work is to make the organization’s IT systems more reliable.
  • By planning for tomorrow’s capacity they ensure that the applications that our co-workers and customers rely on will continue to be reliable throughout increased usage and new changes.
  • By managing IT performance they enable the day to day reliability of these applications so that revenue can be generated and growth achieved.
By focusing our efforts on the greater goal of IT reliability we are able to heal the great disconnect that happens much too often between the business and IT. Showing the business graphs of increased CPU consumption or bandwidth utilization does not enable them to relate to the performance and capacity issues that they are concerned with. IT resource consumption metrics may be the “vital signs” of IT systems, but this is kind of like checking a patient’s pulse and respiratory rate and saying that they are healthy enough even though the patient may be suffering from a terminal disease. The lights may be on, but are the IT systems you manage perceived as reliable by the end users and the business?

Put yourself in the shoes of the end user; you want any action that you perform within an application to have a quick and valid response. That is reliability in the eyes of the business and the end user and that is exactly what your work as a performance or capacity professional is seeking to enable.

Transactions
What connects all of the various parts of the puzzle? What links the business, the end users, the network, firewalls, proxy servers, web servers, application servers, load balancers, message brokers, databases, and mainframes? Transactions.
When posing the question "how do YOU define a transaction?" to a room full of IT professionals you are likely to receive a handful of different answers.
Typically one would define an HTTP request, database call, CICS transaction or SOAP request as a transaction. In this paper these are defined as transaction segments that are part of a greater transaction.
A transaction is the most elementary unit of work that can be performed by a user of an application. Whenever a user clicks a button within an application, they have performed a “transaction activation”. This “transaction activation” can trigger any number of IT processes within the datacenter that are used to get the work done. A transaction could be a transfer of funds, a purchase, an update of information, or the opening of a new account – any user interaction with the application – also known as a business service or unit of work.
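To make the distinction between a transaction and its segments concrete, here is a small Python sketch. The class and field names are hypothetical, and it assumes the segments run one after another so that their latencies simply add up, which will not hold for parallel calls.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    kind: str          # e.g. "HTTP request", "database call", "CICS transaction"
    component: str     # hypothetical name of the tier that handled the segment
    latency_ms: float


@dataclass
class Transaction:
    name: str                                     # the business-level unit of work
    segments: list = field(default_factory=list)  # Segment objects in call order

    def total_latency_ms(self) -> float:
        # Assumes sequential segments, so the latencies are summed.
        return sum(s.latency_ms for s in self.segments)

    def slowest_segment(self) -> Segment:
        return max(self.segments, key=lambda s: s.latency_ms)
```

A "transfer funds" transaction would then carry its HTTP request, database call and CICS transaction as three segments, and slowest_segment() would point to the tier that contributes the most latency.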

To be continued next week...

Friday, May 15, 2009

BTM & IT Service Management

From managing incidents to managing changes, availability, security, and compliance, BUSINESS TRANSACTION MANAGEMENT has been leveraged by large IT organizations to augment their ITSM activities.

Let’s look at the impact of BUSINESS TRANSACTION MANAGEMENT on each key ITSM process as well as on compliance and auditing activities.

Incident Management – BUSINESS TRANSACTION MANAGEMENT feeds business transaction events and SLA violations to help desk systems (in addition to current component based events).

Problem Management – BUSINESS TRANSACTION MANAGEMENT pinpoints exactly where a problem is in the entire data center topology from a business transaction perspective (instead of wasting hours or days pointing fingers in a war room).

Change and Release Management - BUSINESS TRANSACTION MANAGEMENT assesses in pre-production the impact of change to transaction performance by comparing detailed transaction metrics across builds (providing a more holistic view of performance and more granular drill-down information).

Configuration Management - BUSINESS TRANSACTION MANAGEMENT automatically discovers and models your topology based on real transactions, keeping your CMDB up to date (instead of relying on static modeling).

Availability Management - BUSINESS TRANSACTION MANAGEMENT monitors SLA compliance across all tiers (instead of being limited to specific tiers) – SLA thresholds are defined automatically, based on transaction averages; or can be manually defined.
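One possible reading of "SLA thresholds are defined automatically, based on transaction averages" is sketched below in Python. The rule used here (historical average multiplied by a tolerance factor) and the default factor of 1.5 are assumptions made for illustration; actual BTM products may derive thresholds differently.

```python
from statistics import mean


def auto_threshold_ms(response_times_ms, tolerance=1.5):
    """Threshold = historical average * tolerance (tolerance is an assumed knob)."""
    if not response_times_ms:
        raise ValueError("need at least one historical measurement")
    return mean(response_times_ms) * tolerance


def sla_violations(observed_ms, threshold_ms):
    """Return the observed response times that exceed the threshold."""
    return [t for t in observed_ms if t > threshold_ms]
```

A manually defined threshold would simply be passed to sla_violations() in place of the computed one.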

Capacity Management - BUSINESS TRANSACTION MANAGEMENT monitors transaction volume trends and identifies candidates for consolidation.

Continuity Management - BUSINESS TRANSACTION MANAGEMENT provides business transaction measurements for your disaster recovery testing and failover scenario management.

Security Management - BUSINESS TRANSACTION MANAGEMENT tracks all transactions including the location they are coming from, exactly where they are going and what they are doing – powerful data for both detection of security risks and forensic analysis.

Compliance Management - BUSINESS TRANSACTION MANAGEMENT leverages its full transaction monitoring data to prove regulatory compliance even across virtual servers. It also provides application developers with production metrics without giving them access to production (segregation of duties, SoD).

Auditing - BUSINESS TRANSACTION MANAGEMENT provides a full transaction audit trail and trending metrics.

Saturday, April 25, 2009

Configuration Management Data Base (CMDB) and BTM

Any organization that has implemented or is planning to implement the ITIL methodology will find great value in a Business Transaction Management (BTM) solution. BTM contributes significantly to the population and utilization of your CMDB, along with the auto creation of a service model relationship within your CMDB.

CMDB Population: Business Transaction Management solutions auto discover your application dependency map by monitoring the real transactions that run through an application's full topology. Additionally, the auto discovered transaction types are then added to the CMDB. Anyone accessing the CMDB that wants to see more specific information about various transactions will be referred directly to the BTM tool's transaction repository.
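The idea of discovering a dependency map from real transactions can be sketched in a few lines of Python. The trace format used here, a transaction type paired with the ordered list of components it passed through, is an assumption for this example, as are the function names.

```python
from collections import defaultdict


def build_dependency_map(traces):
    """traces: list of (transaction_type, [component, ...]) in call order.

    Returns {caller: set of callees}, built only from hops that were observed.
    """
    edges = defaultdict(set)
    for _transaction_type, path in traces:
        for caller, callee in zip(path, path[1:]):
            edges[caller].add(callee)
    return dict(edges)


def components_for(traces, transaction_type):
    """Every component a given transaction type was observed to touch."""
    return {c for t, path in traces if t == transaction_type for c in path}
```

Because the map is derived from observed paths only, components_for() returns just the servers a given business service really uses, rather than every server the application could use.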

Enhanced Utilization: BTM solutions enable better utilization of a CMDB by linking true business activities to IT processes, enabling the management of IT from the business perspective. Additionally, BTM solutions connect specific transaction segments and servers – virtual or real – to service degradations enabling rapid resolution.

BTM Enables the Real Time Population of Your CMDB

BTM solutions are designed to operate around the clock in production environments and collect important data on all transactions that flow through all of the IT components in the datacenter. This enables your CMDB to be continuously populated with the critical business processes that are linked to their underlying IT counterparts.

Some examples of Configuration Items (CIs) that a BTM solution can populate a CMDB with are:

Business services (transaction types)

Data flows - the full structure of the infrastructure that supports each transaction - down to the methods that are executed - enabling much more granular impact analyses in change management

The Topology of each transaction type (true service models)

IP addresses of clients and servers

All of the possible physical servers that transactions may run through

All of the possible virtual servers that may be brought up on the fly in response to increasing load

The advantage of defining business relationship CIs with the help of a BTM solution is that those relationships are 100% accurate and are based on the flow of real transaction activations in the production environment, as opposed to other CMDB populating tools which must rely on assumptions, or manual input.

It is now possible to understand which changes affect which transactions – or business services - so that the criticalities of each change can be better understood from a true business perspective.

CMDB and the Help Desk Ticket

A BTM solution will automatically detect slow or hanging business transactions and open an Incident. By doing so, the IT Service Support or Help Desk team can be proactive in dealing with the incident. They can proactively start working on the root cause of the problem and fix it – possibly even before any users feel the negative impact of the degradation.

Tickets that are opened by the help desk are automatically put in the context of the CMDB since the BTM solution links the IT components that are involved in the transaction degradation to that ticket. This not only enables tickets to be sent to the appropriate administrator for resolution, it also enables that administrator to utilize CMDB information to help resolve the issue.

BTM – CMDB Scenarios

A Service degradation

The BTM solution identifies the degradation of a specific transaction type’s SLA

An incident is created and a help desk ticket is opened - flagging the server which is causing the latency - for example – the latency is due to a specific application server

The Help Desk forwards the ticket to the Application Team as part of the problem resolution process

The Application Team can now look at the change history of that specific application server within the CMDB and identify the cause of the problem

A Change is Being Contemplated

A change to a specific application in the application server is being contemplated

The Change Manager identifies all of the real business transactions that are associated with the change

The Change Manager will be able to anticipate the business impact that the change will have on the business users and will later be able to measure the impact with real user measurements

Prioritization of Incidents

Two servers fail at the exact same time

The system administrator knows which server is more critical to the business and can prioritize the recovery - something that is not possible to perform without the true model
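As a sketch of how a transaction-based service model could drive this prioritization, the Python function below ranks failed servers by the business weight of the transaction types that run through them. The numeric criticality weights are an assumption; in practice they would come from the business.

```python
def recovery_order(failed_servers, service_model, criticality):
    """Rank failed servers by the summed criticality of the transaction types using them.

    service_model: {transaction_type: set of servers it runs through}
    criticality:   {transaction_type: numeric business weight} (assumed input)
    """
    def impact(server):
        return sum(
            criticality.get(tx, 0)
            for tx, servers in service_model.items()
            if server in servers
        )

    # Highest impact first; ties are broken by name so the order is deterministic.
    return sorted(failed_servers, key=lambda s: (-impact(s), s))
```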

Change Verification

Measuring the impact of a change that has been conducted

A change to the application has been implemented

The BTM solution will show the impact of that change by comparing the performance of transactions and transaction segments before and after the change
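A before-and-after comparison of this kind could look roughly like the Python sketch below. The input format (response times in milliseconds grouped by segment name) is an assumption, and the comparison uses the mean only; it also assumes every baseline mean is greater than zero.

```python
from statistics import mean


def compare_change(before_ms, after_ms):
    """before_ms / after_ms: {segment name: [response times in ms]}.

    Returns {segment: percent change in mean response time} for segments
    present in both windows. Positive means slower after the change.
    """
    report = {}
    for segment in before_ms.keys() & after_ms.keys():
        old, new = mean(before_ms[segment]), mean(after_ms[segment])
        report[segment] = (new - old) / old * 100.0  # assumes old > 0
    return report
```

The same function can serve the roll back case by passing the last known consistent time frame as the "before" window.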

Roll Back Has Been Performed

The performance of all transactions in the current time frame is compared to the time frame of the last known consistent state in order to verify the roll back

Comparisons

The performance of specific transaction segments on two identical servers is compared in order to verify the consistency of performance

The topologies of two similar applications are compared in order to validate the quality of implementation

A CMDB for the Cloud

Cloud computing's value proposition to the enterprise is a substantial one, but maintaining a CMDB for the cloud introduces new challenges. These challenges are related to the real time scalability of applications in the cloud.

Business Transaction Management solutions can, at any given moment, provide a real time snapshot of the cloud, enabling a CMDB to keep up with the constant change.

Additionally, BTM's ability to effectively support a federated environment contributes to the stability and scalability of a CMDB in the cloud.

The CMDB can define a template that is filled in by the Business Transaction Management solution. In this manner the current status of the cloud can be shown – this can only be done with a real time BTM tool.

BTM Brings Value to Your CMDB Investment

Consider the following scenario: an application utilizes twenty different servers within the datacenter, but a specific business critical service that the application provides utilizes only four of those servers. When building a CMDB with traditional tools it is nearly impossible to crack this scenario, since most models are based on theories and assumptions. The only way of achieving this level of granularity with the CMDB – short of going line by line through the application’s code – is by using BTM to help populate the CMDB with real data.

Implementing a CMDB is not a small investment, not to mention maintaining it, utilizing it, keeping it up to date and ensuring its validity. Business Transaction Management solutions enable organizations to better utilize and maintain their CMDB by filling in the gaps that occur in real time. BTM solutions are the only solutions that can link the degradation of a business service to the problematic IT component. This enables the location of the relevant CMDB data needed in order to resolve the problem.

Friday, April 10, 2009

BTM and Real User Measurements (RUM)

A complete BTM solution cannot ignore the “first mile” – the segment from the end-user to the data center. Real User Measurements (RUM) should be part of any complete BTM solution, since every user related transaction originates at the user.

Enterprises must have the ability to measure the level of service that their users are actually receiving - from their own desktop - and provide fast answers when performance degradation occurs.

Whether response times are degrading, or transaction failures are proliferating, IT staff must provide fast answers.

RUM must address two separate needs:

  • Web based applications used by remote users at their home or on a mobile device
  • Internal corporate users with desktop or Citrix based applications

For home users, installation on the home user desktop is usually not an option. BTM must provide the technology for measuring latencies from the home user’s browser, without installation on the desktop.

Enterprise users tend to use numerous applications, some of them fat clients (no browser) installed on their desktop. In such cases a local agent has to be deployed to capture every transaction issued by the user in order to measure its latency and successful completion.

Agent based RUM solutions also provide inventory metrics – how many applications are actually being used, what applications are installed and executed within the user’s operating system, and so forth.

Wednesday, March 11, 2009

BTM Benefits Both Business and IT

Business Transaction Management is becoming a strategic investment for any enterprise since it sits at the heart of IT management.

With a complete BTM solution, every single transaction is captured and traced, from the moment the end user hits a button in any application, be it browser or desktop based, to the last CICS program call in the Mainframe.

Complete BTM solutions do this within a production environment, and are always turned on, in order to assure the reliability of complex modern applications.

The Benefits

BTM enables IT to align with business requirements by making sure every business transaction is processed successfully and in a timely manner, and by putting every IT process within the datacenter in the context of an end user activated transaction.

End-to-end transaction tracing enables full accountability for every transaction processed within the IT topology, whether it passes through proxies, load balancers, web servers, or application servers (based on J2EE/.NET, C/C++ or even proprietary home grown applications, e.g. COBOL), down to message brokers/hubs, databases and legacy systems within the Mainframe.

No tier is ignored, no transaction is lost. Real time alerts are propagated - proactively - from the moment a performance degradation starts to occur. Root-cause-analysis can always be done in a scientific and accurate manner.

By creating a Performance Management Database (PMDB), IT staff can now speak to each other in a common language. No more talking about packets, HTTP, SQL, or CICS – but a single transaction that encapsulates all relevant information on how it is doing, where, and when.

Automatic topology mapping is a byproduct of any BTM solution. By tracing the transaction along the IT topology, a complete map of IT relations – what logical component is interacting with what – emerges, at a granularity that has never before been available. Every web server instance, every message broker, every database and every external application that is part of a transaction flow will be located in the relevant path within the topology map. Real time measurements allow constant monitoring of latencies between different tiers, and of actual transaction volumes.

Proactive management – proactive resource allocation per transaction instance (in other words, prioritization of transactions according to business context) and blocking of transactions or transaction segments according to security requirements are all part of the potential of a BTM solution.
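To illustrate what prioritizing and blocking transactions by business context might look like, here is a minimal Python sketch built on a priority queue. The class name, the rank table and the blocked list are all hypothetical; this is not a description of how any BTM product allocates resources.

```python
import heapq
import itertools


class TransactionScheduler:
    """Serve pending transactions by business priority; block listed types."""

    def __init__(self, priorities, blocked=()):
        self._priorities = priorities    # {transaction_type: rank}, lower rank is served first
        self._blocked = set(blocked)
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps arrival order within a rank

    def submit(self, transaction_type, payload=None):
        """Queue a transaction; return False if its type is blocked."""
        if transaction_type in self._blocked:
            return False
        rank = self._priorities.get(transaction_type, float("inf"))
        heapq.heappush(self._heap, (rank, next(self._order), transaction_type, payload))
        return True

    def next_transaction(self):
        """Pop the highest-priority pending transaction, or None if the queue is empty."""
        if not self._heap:
            return None
        _rank, _seq, transaction_type, payload = heapq.heappop(self._heap)
        return transaction_type, payload
```

Transaction types without an assigned rank are served last, which is a design choice of this sketch rather than a requirement.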