An exchange from dm-discuss, an excellent list.
(Original post - not from me, followed by my answer)
I am in search of information specifically related to database "instance" configuration management. Any information/opinions would be great. To be precise, what I mean by "instance" would be the configuration of, say, Oracle 10gr2 or Sybase ASE 15 that is mounted on a host OS (e.g. on HP-UX 11.11). Let me explain: What we are being solicited for by our large user base is management (e.g. large scale global management) of the database software that is mounted on specific types of 'boxes'.
Take a large Wall Street customer of ours. They have HUGE infrastructure and sophisticated CM processes related to OS configuration management of those servers (e.g. managing the HP-UX system variables), and this is managed by core sys-ops professionals. But the database type (e.g. Oracle), version (e.g. 10g), patch level (e.g. R2.xxx), space allocation, etc., managed on those servers they feel is better assessed and globally managed in 100's if not 1000's of places by DBAs, as any variable adjustments to the instance configuration could directly affect application performance... and is obviously a core SLA of a production DBA. They feel that the specialization of this database knowledge is beyond sys-ops people and core knowledge must come from the DBA ranks.
So, in sum, this new 'layer' of configuration management of the database OS is one level of abstraction higher than the database 'schema' (read: DDL and object management of the application being installed within the database software) and one layer of abstraction lower than the typical OS configuration management on which the database software (e.g. the Oracle host) is mounted.
Soooo, with all this background said, does anyone have any resources I may be pointed to or personal opinions on how your company may be managing this specifically?
(My answer) -
The relationship of production DBAs and their DBMSes to hosting concerns is indeed a complex and interesting one. I've been sorting some related things out for a comparable organization.
The first priority is a more precise understanding of what we mean by "configuration management." I break it down into three major classes:
- Software configuration management
- Element configuration management
- Enterprise configuration management
The discussion in this thread has focused on database configuration management as an aspect of software configuration management (SCM). SCM is characterized by management of text-based assets (source code, SQL) and binary assets, with versioning, branching, and merging capabilities. Text-based management includes differencing ("what changed"). Examples of SCM tools are Endevor (mainframe); CVS, Subversion, and ClearCase (distributed); and Aldon (AS/400).
Element configuration management I define as the most detailed and technical non-programming work done in IT. It includes provisioning, parameter management, and drift control. It applies to physical devices, operating systems, and middleware (including DBMSes), but typically not to functional applications.
Provisioning is the initial task of building a configuration item as a service component, e.g. installing an OS, utilities, and DBMS engine to a certain patch level on a "bare metal" box.
Parameter management is the (often extremely detailed) work in configuring and tuning sophisticated technical systems. A robust provisioning architecture will allow the repeatable re-creation of particular "tunings," e.g. Oracle 10 on a Linux box with a particular failover architecture. The issue your client has identified is that their server build engineers are not skilled in DBMS parameter management, and (as I've seen elsewhere) the server engineers are the ones more familiar with the enterprise change process and probably operating the provisioning and discovery/drift control tools.
Discovery/drift control is the discovery and baselining of particular parameters of interest and automating their management so that changes result in notifications (often integrated with enterprise monitoring and the enterprise change process) and possibly automated rollback (e.g. if someone tweaks the Oracle optimizer_mode setting, or the Apache port). The ability to compare a given server's build against a 'gold standard' fits here.
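To make the 'gold standard' comparison concrete, here is a minimal sketch in Python. It assumes the baseline and the observed settings have already been collected into plain dictionaries; how they are collected is out of scope. The names `Drift` and `detect_drift` and the sample values are hypothetical illustrations, not part of any vendor's tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Drift:
    """One parameter whose observed value differs from the baseline."""
    parameter: str
    expected: Optional[str]  # None: parameter is absent from the baseline
    actual: Optional[str]    # None: parameter is absent from the server


def detect_drift(baseline: Dict[str, str], current: Dict[str, str]) -> List[Drift]:
    """Compare observed parameters against a gold-standard baseline."""
    drifts = []
    # Walk the union of names so added and removed parameters are both caught.
    for name in sorted(set(baseline) | set(current)):
        expected = baseline.get(name)
        actual = current.get(name)
        if expected != actual:
            drifts.append(Drift(name, expected, actual))
    return drifts


# Illustrative values only (assumed, not taken from a real server).
gold = {"optimizer_mode": "ALL_ROWS", "processes": "300"}
observed = {"optimizer_mode": "FIRST_ROWS", "processes": "300"}

for d in detect_drift(gold, observed):
    print(f"{d.parameter}: expected {d.expected}, found {d.actual}")
```

In a real deployment the list of drifts would feed a notification or a change ticket rather than a print statement.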
COBIT uses "configuration management" more in this sense. Opsware and BladeLogic are well-known provisioning tools, while Tripwire and Solidcore are examples of drift control tools.
Element config is difficult in that there are so many parameters to manage. Careful consideration needs to be given to which parameters must remain under change control, versus those that can be allowed to change at the discretion of the engineering team. Naive approaches to element config over-control every possible parameter, resulting in non-agile operations and frustrated engineers.
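One way to picture the split between controlled and discretionary parameters is a small policy table. The sketch below is an assumption-laden illustration: the `Policy` enum, the `POLICY` table, and the choice of which parameters are controlled are all hypothetical, and defaulting unlisted parameters to discretionary is just one possible design.

```python
from enum import Enum


class Policy(Enum):
    CONTROLLED = "controlled"        # change needs an approved change record
    DISCRETIONARY = "discretionary"  # engineering team may change freely


# Hypothetical policy table; a real one would be agreed between DBAs and sys-ops.
POLICY = {
    "optimizer_mode": Policy.CONTROLLED,
    "sga_target": Policy.CONTROLLED,
    "open_cursors": Policy.DISCRETIONARY,
}


def triage(changed, policy=POLICY, default=Policy.DISCRETIONARY):
    """Split changed parameter names into (controlled, discretionary) lists.

    Parameters missing from the table fall back to `default`. Defaulting to
    DISCRETIONARY avoids over-controlling every possible parameter.
    """
    controlled, discretionary = [], []
    for name in changed:
        if policy.get(name, default) is Policy.CONTROLLED:
            controlled.append(name)
        else:
            discretionary.append(name)
    return controlled, discretionary


needs_change_record, informational = triage(["optimizer_mode", "open_cursors"])
print("Needs change record:", needs_change_record)
print("Informational only:", informational)
```

Only the first list would be routed into the enterprise change process; the second might simply be logged.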
Enterprise configuration management is the management of large-grained IT configuration items (e.g. Servers, Applications, Datastores) and their dependencies. The dependency management aspect in particular makes enterprise config overlap with metadata management and enterprise architecture. Application mapping tools fit here. ITIL uses "configuration management" more in this sense. BMC, CA, HP, and IBM all have fast-evolving offerings here that sooner or later will converge with metadata management - it's inevitable.
I think what you are talking about is element configuration management. My perception is that the core provisioning and drift management vendors are a little behind in supporting the DBMS world, but are aware of the needs you are citing and moving in this direction. One interesting aspect of DBMSes is that their element configuration management is often conveniently represented as scripts (this is also true of other classes of element managers). In this sense both SCM and element config come into play. However, the need for operational monitoring prevents the use of simple SCM approaches - to detect drift, one would have to manually dump out the SQL/DDL description of the database's configuration and compare it to the previous version offline using an SCM diff capability. This is not possible at scale; what is needed instead is continual automated monitoring of the DBMS engine for changes.
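For illustration, the dump-and-diff approach looks roughly like the sketch below, which uses Python's standard `difflib` module in place of an SCM tool's diff. The `config_diff` helper and the `name=value` dump format are assumptions made for the example; a real dump would be whatever SQL/DDL or parameter listing the DBMS produces.

```python
import difflib


def config_diff(previous: str, current: str) -> list:
    """Return unified-diff lines between two text dumps of a configuration."""
    return list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",  # inputs carry no newlines, so headers should not either
        )
    )


# Hypothetical dumps in an assumed name=value format.
prev_dump = "optimizer_mode=ALL_ROWS\nprocesses=300\n"
curr_dump = "optimizer_mode=FIRST_ROWS\nprocesses=300\n"

for line in config_diff(prev_dump, curr_dump):
    print(line)
```

Doing this by hand for one database is easy enough; the scaling problem is that someone has to produce and compare the dumps for every instance, which is why continual automated monitoring is the better fit.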
So in sum:
- Look to the provisioning vendors and find out where they are at w/respect to DBMSes and DBMS element managers. Look for the acquisition of DBMS management specialty vendors by general management framework/provisioning vendors.
- Carefully define the parameters to be "under management" and what that means with respect to the enterprise change and incident management processes. This is where the DBAs would partner with the sys-ops people. Both have something to bring to the table and neither can solve it on their own.