http://www.itsmwatch.com/itil/article.php/3671721/Bringing-ITIL-to-Life-Automating-IT-Capacity-Management.htm
By Drew Robb
Apr 13, 2007

For IT organizations facing hundreds of servers that support vital business functions, capacity management automation has become a must.

"The sheer volume of information required to do capacity management in today's highly complex IT infrastructures makes automation a necessity," said Ed Holub, an analyst at Gartner. Even with automation in place, however, there still is a lot of effort required by senior IT professionals to effectively manage capacity.

In capacity planning, data collection tools are first put in place. This enables organizations to gather performance and capacity data, which can be analyzed to build a baseline view of where the current infrastructure stands. With the baseline in hand, the organization can better understand existing business plans and their potential impact on the IT infrastructure. That comparison highlights any shortfalls against predicted future resource needs. With these results reported to management, the cycle continues: capacity and performance data is gathered on the upgraded infrastructure, analyzed, and used to establish new baselines.

Thus capacity planning is a continuous process. As workloads change, hardware is added or networks are reinforced, new baselines must be established and future needs forecast accurately.

"The first element of capacity management is visibility of the infrastructure in your environment and knowledge of how the elements are connected together to deliver business services and the associated service levels," said Rob Stroud, an IT Service Management evangelist at CA. The second element is to understand the demand on your environment.

Any organization beginning capacity planning activities for the first time faces a daunting prospect: the entire enterprise lies before it. Every process, every resource, every system and every building is a potential target.

Getting Started

The best approach is to prioritize capacity planning efforts based on mission-critical needs.
That means focusing first on the infrastructure components that support the applications necessary to business survival. Typically, this centers on order processing, order fulfillment, manufacturing and customer service, depending on the business.

Once priorities have been established, the capacity planner should begin with a resource view to gather data, look for outliers and find out more about them. With that data in hand, the next step is to build profiles for each component or group of components, such as clusters, banks and mirrors.

The capacity planner should also dig in to locate repetitive cycles. For example, there might be a spike in server usage every Friday afternoon caused by everyone logging on to check messages and complete tasks before the weekend. Monthly, quarterly and annual processes can also be tracked. Capacity planning efforts can be thwarted by a failure to take these repetitive cycles into account.

Further, the capacity planner must determine representative time frames, that is, the usage levels that are typical of various periods: How many workstations will be in use at any one time? How will usage patterns shift over time? Similarly with servers, representative time frames must be established to take into account usage and other metrics.

Obviously, such tasks require automation. But rolling out performance data capture software across several thousand servers can be a daunting task. Even if agents are used, they still need to be configured to customize the data collected and the way it is aggregated for reporting purposes.

Further, associating business events with usage can be problematic. Performance data, after all, is of little use if you can't determine the business events associated with the usage. In large organizations with several hundred applications, this task is complex and extensive.
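The repetitive-cycle check described above (the Friday afternoon spike) can be sketched in a few lines of Python. This is a minimal illustration, not a method from the article: the function names, the 1.5x peak factor and the synthetic samples are all assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def weekly_profile(samples):
    """Average utilization for each (weekday, hour) slot.

    samples: iterable of (datetime, percent_utilization) pairs.
    """
    slots = defaultdict(list)
    for ts, util in samples:
        slots[(ts.weekday(), ts.hour)].append(util)
    return {slot: mean(values) for slot, values in slots.items()}


def recurring_peaks(profile, factor=1.5):
    """Slots whose average exceeds `factor` times the overall average.

    The factor is an arbitrary illustrative cut-off, not a recommended value.
    """
    overall = mean(profile.values())
    return sorted(slot for slot, avg in profile.items() if avg > factor * overall)


# Synthetic data (assumption): four weeks of hourly samples at 30%,
# with a spike to 80% every Friday between 15:00 and 16:59.
start = datetime(2007, 4, 2)  # a Monday
samples = []
for h in range(4 * 7 * 24):
    ts = start + timedelta(hours=h)
    busy = ts.weekday() == 4 and ts.hour in (15, 16)  # weekday 4 is Friday
    samples.append((ts, 80.0 if busy else 30.0))

peaks = recurring_peaks(weekly_profile(samples))
print(peaks)  # slots reported as (weekday, hour)
```

The same grouping idea extends to monthly or quarterly cycles by changing the slot key, for example to the day of the month.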
Such challenges can be overcome by using installation scripts that integrate easily into existing software distribution tools to help automate installation. Centrally based administration can also facilitate configuration by propagating commonly used configurations across large numbers of servers. A common configuration might, for example, account for operating system component usage in an overhead category and database management system usage in a DBMS category.
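The category-based accounting mentioned above can be pictured as a simple rollup. The sketch below is illustrative only: the process names, the `CATEGORY_RULES` mapping and the `cpu_by_category` helper are hypothetical and do not reflect any particular vendor's agent configuration.

```python
from collections import defaultdict

# Hypothetical mapping from process name to reporting category.
CATEGORY_RULES = {
    "kernel": "overhead",
    "sshd": "overhead",
    "oracle": "DBMS",
    "mysqld": "DBMS",
}


def cpu_by_category(process_samples, rules=CATEGORY_RULES, default="other"):
    """Sum CPU seconds per category from (process_name, cpu_seconds) pairs.

    Processes with no matching rule fall into the `default` category.
    """
    totals = defaultdict(float)
    for name, cpu_seconds in process_samples:
        totals[rules.get(name, default)] += cpu_seconds
    return dict(totals)


# Made-up sample data for one collection interval on one server.
samples = [("kernel", 12.0), ("oracle", 48.5), ("sshd", 0.5), ("orderapp", 30.0)]
totals = cpu_by_category(samples)
print(totals)
```

Keeping the rules in one shared table is what makes central propagation practical: the same mapping can be pushed to every server so reports aggregate consistently.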
The Role of Analytics

Analytics and business intelligence (BI) tools play a part in capacity management. Advanced analytics permit you to better monitor infrastructure behavior. For example, you may have a server that operates at 40% capacity. One day the utilization jumps to 60% and stays there. Because your alerting threshold is set at 75%, it may be some time before you realize that there might be a problem.

"In addition, advanced analytics could perform continuous trending functions so when application usage strays from what is expected, the appropriate people are alerted to determine cause and permit corrective activities or drive changes to the capacity plans," said Ronald Potter, an IT Best Practices manager for TeamQuest Corp.

Where business metrics are not available, business intelligence tools can help you understand business processes and how they affect infrastructure capacity. By using BI, it is possible to determine counts of business events and associate them with the data contained in the capacity database. Doing so makes it easier to communicate infrastructure capacity in business terms.

But tools are only part of the solution. As with all ITIL implementations, the capacity management process relies on the right combination of people, process and technology. Thus effective capacity management requires working relationships with business units. Changes in business processes, even using the same applications, can dramatically affect system performance.
Signing a large new customer can have a similar impact.

"Without good working relationships with your business customer, you may not discover business changes until after they have happened and your systems are overloaded," said Potter. A good working relationship permits you to run your infrastructure closer to the edge since you have confidence that in most cases you will have enough advance notice to react to business changes.

Process, too, is vital to the success of capacity planning. The road map to success consists of processes that are repeatable and consistent. The results from process efficiency can be significant.

"In research I've conducted on behalf of the IT Process Institute, we discovered that high performing IT organizations (which constituted about 13% of our surveyed population) sustain five times higher server/sys-admin ratios, manage eight times more projects and six times as many applications, and implement 14 times as many changes compared to the typical organization," said Gene Kim, CTO of Tripwire.

Once you understand the processes and their interactions with other processes, automation is key. Automation enables the implementation of the knowledge developed in the organization and allows for enhanced customer support. CA, for example, offers solutions that automate ITIL, as well as supporting materials such as a series of graphical representations, or subway maps, that help organizations wherever they are in their implementations.

Similarly, TeamQuest provides a wealth of tools designed with large-scale installations in mind. Its Enterprise Database facility automatically harvests data from user-defined groups of servers and consolidates it in a single location. This makes it possible to move the work associated with analysis and reporting away from the production servers.
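The analytics scenario described earlier, in which utilization steps from 40% to 60% without ever crossing a 75% alert threshold, can be made concrete with a small sketch. It compares a static threshold check with a simple level-shift test against a baseline window. The window lengths and cut-offs are assumptions chosen for illustration, not values from the article or from any product.

```python
from statistics import mean, pstdev


def static_alert(series, threshold=75.0):
    """Classic threshold check: fires only when a sample reaches the threshold."""
    return any(value >= threshold for value in series)


def level_shift(series, baseline_len=14, recent_len=5, min_sigma=3.0, min_delta=5.0):
    """Return True when the recent average has moved away from the baseline.

    The first `baseline_len` samples define normal behavior; the last
    `recent_len` samples are compared against it. A shift is reported when
    the change is at least `min_delta` percentage points and at least
    `min_sigma` baseline standard deviations. All four parameters are
    illustrative assumptions.
    """
    if len(series) < baseline_len + recent_len:
        raise ValueError("not enough samples for baseline and recent windows")
    baseline = series[:baseline_len]
    recent = series[-recent_len:]
    spread = max(pstdev(baseline), 1e-9)  # avoid division by zero on flat data
    delta = mean(recent) - mean(baseline)
    return abs(delta) >= min_delta and abs(delta) / spread >= min_sigma


# Made-up daily averages: about 40% for two weeks, then a jump to 60%.
history = [39, 41, 40, 40, 38, 42, 40, 41, 39, 40, 40, 41, 39, 40] + [60] * 5

threshold_fired = static_alert(history)
shift_detected = level_shift(history)
print(threshold_fired, shift_detected)
```

On this data the static check stays silent while the level-shift test flags the change, which is the gap the article's example points to.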
With data from multiple servers held in an enterprise database, multi-tier applications can be easily analyzed and reported on from an end-to-end perspective. In addition, TeamQuest View and IT Service Analyzer can assist the capacity management team in proactively identifying problematic applications before they affect production operations. Both products can perform linear trend analyses to ensure current usage trends are in line with long-term capacity plans.

And Tripwire provides IT with the tools to streamline change by alerting management to unauthorized changes that could impair system availability or cause other issues.

Capacity Management Not Enough

IT, then, has to have information from the business regarding forecasted growth so it can translate increases in business volumes into hardware and software resource consumption. It is vital to have well-defined service level agreements between IT and the business, so that just enough capacity can be cost-effectively provisioned to meet those agreements.

That's where capacity management comes in. By automating many of the processes and harnessing various tools to add efficiency, capacity planning efforts can be streamlined and simplified. But Holub points out that capacity management cannot operate in isolation within an ITIL framework. Nor should it be done prior to certain other facets of ITIL.

"Capacity management is one of the higher-order ITIL processes," said Holub. Organizations should ensure they have achieved relatively high process maturity in the core service support processes, such as change management and configuration management, before attempting to tackle capacity management.
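The linear trend analysis mentioned above can be illustrated with an ordinary least-squares fit. The sketch below is a generic example, not a description of how TeamQuest's products work: it fits a straight line to made-up monthly utilization figures and estimates how many periods remain before the line reaches a chosen limit. The function names, the data and the 75% limit are assumptions.

```python
def linear_trend(values):
    """Least-squares fit of values against their index (0, 1, 2, ...).

    Returns (slope, intercept) of the fitted line.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def periods_until(values, limit):
    """Periods after the last sample until the fitted line reaches `limit`.

    Returns None when the trend is flat or declining.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    last_x = len(values) - 1
    return max(0.0, (limit - intercept) / slope - last_x)


# Made-up monthly average utilization (percent) for one server.
monthly = [40.0, 42.0, 44.0, 46.0, 48.0, 50.0]
slope, intercept = linear_trend(monthly)
remaining = periods_until(monthly, 75.0)
print(slope, intercept, remaining)
```

A straight line is the simplest possible model. Workloads with the repetitive cycles discussed earlier, or step changes from new customers, would need those effects handled before a linear projection is meaningful.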