70-223: Windows® 2000 Clustering Service
Before you start
This study guide provides you with information on
the many different aspects of the Windows 2000 Clustering service.
It serves as an introductory guideline on clustering.
Before you proceed with this subject, please read
through the related study material and make sure you are
100% comfortable with the Windows 2000 architecture.
It is not easy to get your hands on clustering
equipment. Please visit the Microsoft web site for the latest list of
equipment that supports Windows 2000 clustering, and see if you can
get a chance to work with it for a while.
Clustering is a broad and advanced topic. Do NOT
rely solely on these study notes for the exam. By all means read more
than one book on the subject and make sure you understand the
material well enough to be ready for the questions.
There is no quick way to succeed with this topic. Ideally you should
work through the material hands-on and gain experience before even
signing up for the exam.
The reasons we deploy clusters are high
availability, scalability, and manageability. While the group of
clustered nodes must be locally connected, administrators can
control the cluster remotely.
Terminology
-
Cluster service is the Windows 2000 name for the original
Microsoft Cluster Server (MSCS) in Windows NT Server 4.0,
Enterprise Edition.
-
Individual computers are referred to as nodes.
-
Cluster service is the collection of components on each node
that perform cluster-specific activity.
-
Resources are the hardware and software components within the
cluster. Resource DLLs define resource abstractions, communication
interfaces, and management operations. A resource is online when
it is available and providing its service. Cluster resources can
include physical hardware devices such as disk drives and network
cards, and logical items such as IP addresses and applications.
Each node has its own local resources, but the cluster also has
common resources, typically the common data storage array and the
private cluster network, which are accessible by each node in the
cluster.
-
Quorum resource is a physical disk in the common cluster disk
array that must be present for node operations to occur.
-
Resource group is a collection of resources managed as a
single, logical unit. When a service is performed on a resource
group, the operation affects all individual resources within the
group. A resource group can be owned by only one node at a time.
Also, individual resources within a group must exist on the node
that currently owns the group. Keep in mind that at any given
instant, different servers in the cluster cannot own different
resources in the same resource group. In case of failure, resource
groups can be failed over or moved as atomic units from the failed
node to another available node.
-
Cluster-wide policy - each resource group has an associated
cluster-wide policy that specifies which server the group prefers
to run on and which server the group should move to in case of a
failure.
-
Resource dependencies - each resource in a group may depend on
other resources in the cluster. These relationships indicate which
resources need to be started and available before another resource
can be started. Dependencies are identified using Cluster service
resource group properties and enable Cluster service to control the
order in which resources are brought online and offline. Note,
however, that the scope of any identified dependency is limited to
resources within the same resource group (the command-line sketch
after this terminology list shows a dependency being defined).
-
Node preference list is a resource group property used to
assign a resource group to a node. In clusters with more than two
nodes, the node preference list for each resource group can
specify a preferred server plus one or more prioritized
alternatives to enable cascading failover - a resource group may
survive multiple server failures, each time failing over to the
next server on its node preference list. As the cluster
administrator, you can set up a different node preference list for
each resource group.
-
Shared-nothing model refers to how servers in a cluster manage
and use local and common cluster devices and resources. Each
server owns and manages its local devices; devices common to the
cluster are selectively owned and managed by a single server at
any given time.
-
Virtual Servers hide the complexity of clustering operations.
To users and clients, connecting to an application running as a
clustered virtual server appears to be the same process as
connecting to a single server. Users will not know which node is
actually hosting the virtual server. Cluster service manages each
virtual server as a resource group containing two resources: an IP
address and a network name. To the client, it is simply a view of
individual network names and IP addresses (see the command-line
sketch after this terminology list).
-
Devices - to set up Cluster service, external storage
devices common to the cluster must be SCSI devices; standard
PCI-based SCSI connections are supported, as well as SCSI over
Fibre Channel and SCSI buses with multiple initiators. Windows 2000
Datacenter Server supports four-node clusters and requires device
connections using Fibre Channel. The main point is that the
connection between the nodes and the shared devices must be fast
and reliable.
-
Two Types of Clusters: Cluster service is intended to provide
failover support for applications. On the other hand, Network Load
Balancing service load balances incoming IP traffic across
clusters of up to 32 nodes to enhance both the availability and
scalability of Internet server-based programs. You can combine the
two. Typically this involves deploying Network Load Balancing
across a front-end Web server farm, and clustering back-end
line-of-business applications such as databases with Cluster
service.
-
NLB - Network Load Balancing lets system administrators build
clusters with up to 32 hosts among which it load-balances incoming
client requests. The setup is completely transparent, meaning that
clients are unable to distinguish the cluster from a single
server, and programs are not aware that they are running in a
cluster setup. Control can be defined on a port-by-port level, and
hosts can be added to or removed from a cluster without
interrupting services. In an NLB setup, host failures are detected
within five seconds, and recovery is accomplished within ten
seconds - the workload is automatically and transparently
redistributed among the cluster hosts.
-
Performance measurements have shown that Network Load
Balancing's efficient software implementation imposes very low
overhead on network traffic handling and delivers excellent
performance scaling limited only by subnet bandwidth. Network Load
Balancing has demonstrated more than 200 Mbps throughput in
realistic customer scenarios handling e-commerce loads of more
than 800 million requests per day.
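To make several of these terms concrete, here is a minimal
command-line sketch of defining a virtual server with the cluster.exe
tool. All names and addresses (MYCLUSTER, NODE1, NODE2, the
"WebGroup" group, the "Disk D:" resource, and the IP address) are
hypothetical, and the exact switches should be verified against
cluster /? on your own system before use:

cluster MYCLUSTER resource "Web IP" /create /group:"WebGroup" /type:"IP Address"
cluster MYCLUSTER resource "Web IP" /priv Address=192.168.1.50 SubnetMask=255.255.255.0 Network="Public"
cluster MYCLUSTER resource "Web Name" /create /group:"WebGroup" /type:"Network Name"
cluster MYCLUSTER resource "Web Name" /priv Name=WEBAPP1
cluster MYCLUSTER resource "Web Name" /adddep:"Web IP"
cluster MYCLUSTER group "WebGroup" /setowners:NODE1,NODE2
cluster MYCLUSTER group "WebGroup" /online

The /adddep line expresses a resource dependency (the network name
cannot come online until its IP address is online), and the
/setowners line defines the node preference list for the group;
bringing the group online then publishes WEBAPP1 as a virtual server
to clients.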
Components
-
Checkpoint Manager saves application registry keys in a cluster
directory stored on the quorum resource.
-
Communications Manager manages communications between cluster nodes.
-
Configuration Database Manager maintains cluster configuration
information.
-
Event Processor receives event messages from cluster resources and
requests from applications to enumerate cluster objects.
-
Event Log Manager replicates event log entries from one node to all
other nodes in the cluster.
-
Failover Manager performs resource management and initiates
appropriate actions.
-
Global Update Manager provides the global update service used by
cluster components.
-
Log Manager writes changes to the recovery logs stored on the quorum
resource.
-
Membership Manager manages cluster membership and monitors the
health of the other nodes in the cluster.
-
Node Manager assigns resource group ownership to nodes based on two
factors: the group preference lists and node availability.
-
Object Manager manages all Cluster service objects.
-
Resource Monitors monitor the health of each cluster resource using
callbacks to resource DLLs. Resource Monitors provide the
communication interface between resource DLLs and the Cluster
service. When the Cluster service needs to obtain data from a
resource, the Resource Monitor receives the request and forwards it
to the appropriate resource DLL, and vice versa. Keep in mind that a
Resource Monitor runs in a process separate from the Cluster service
to protect the Cluster service from resource failures.
The Node Manager runs on each node and maintains a
local list of the nodes that belong to the cluster. Periodically, it
sends heartbeat messages to its counterparts running on the other
nodes to detect node failures. If one node detects a communication
failure with another cluster node, it broadcasts a message to the
entire cluster, causing all members to verify their view of the
current cluster membership in a regroup event. No write operations
to any disk devices common to all nodes in the cluster are allowed
until the membership has stabilized. The node that is not responding
is removed from the cluster, and its active resource groups are
moved to another active node. To select the node to which a resource
group should be moved in a setup with more than two nodes, the Node
Manager identifies the node on which the resource group prefers to
run and the possible nodes that may own its individual resources.
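As an aside, the node and group states that the Node Manager and
Failover Manager maintain can be checked from the command line with
cluster.exe. A small sketch (cluster and node names are hypothetical;
verify the switches with cluster /?):

cluster MYCLUSTER node NODE2 /status
cluster MYCLUSTER group

The first line shows whether NODE2 is up, down, or paused; the second
lists every resource group and the node that currently owns it, which
is a quick way to see where groups landed after a regroup or failover.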
The Configuration Database Manager implements the
functions needed to maintain the cluster configuration database,
which holds information about all of the physical and logical
entities in a cluster. The Configuration Database Managers running
on each node cooperate to maintain consistent configuration
information across the cluster, using a one-phase commit method to
ensure the consistency of the copies of the configuration database
on all nodes. Keep in mind that cluster-aware applications use the
cluster configuration database to store recovery information. For
applications that are not cluster-aware, information is stored in
the local server registry. The Log Manager, together with the
Checkpoint Manager, ensures that the recovery log on the quorum
resource contains the most recent configuration data and change
checkpoints. This is done to ensure that the Cluster service can
recover from a resource failure.
Supported Services
Services supported by clustering are determined by
the availability of the corresponding resource DLLs. Resource DLLs
provided with Windows NT Server 4.0, Enterprise Edition enable
Cluster service to support file and print shares, generic services
or applications, physical disks, Microsoft Distributed Transaction
Coordinator, Internet Information Services, Message Queuing, and
network addressing and naming. With Windows 2000 Advanced Server and
Windows 2000 Datacenter Server, we have support for the following
additional services: Distributed File System, Dynamic Host
Configuration Protocol, Network News Transfer Protocol, Simple Mail
Transfer Protocol, and Windows Internet Name Service (WINS). In
addition, cluster-aware applications that provide their own resource
DLLs can enable customized advanced scalability and failover
functions.
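You can see which resource types (and therefore which resource DLLs)
are actually registered on a given cluster from the command line. A
hedged sketch, assuming the restype object supported by the Windows
2000 cluster.exe (confirm the object name and switches with
cluster /?):

cluster MYCLUSTER restype
cluster MYCLUSTER resource

The first line lists the registered resource types, and the second
lists the resources that have been created from them along with
their current states.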
Failover and Failback
Failover can occur automatically when a failure
occurs, or when you manually trigger it. Resources are gracefully
shut down for a manual failover, but are forcefully shut down in the
failure case. Automatic failover requires determining what groups
were running on the failed node and which nodes should take
ownership, meaning all nodes in the cluster need to negotiate among
themselves for ownership based on node capabilities, current load,
application feedback, or the node preference list. Cascading
failover assumes that every other server in the cluster has some
excess capacity to absorb a portion of any other failed server's
workload.
When a previously down node comes back online, the
Failover Manager can decide to move some resource groups back to the
recovered node via failback. For this to happen, the properties of a
resource group must define the recovered or restarted node as a
preferred owner. Resource groups for which the recovered or
restarted node is the preferred owner will be moved from the current
owner back to that node. To avoid causing additional problems,
Cluster service provides protection against failback of resource
groups at peak processing times, or to nodes that have not been
correctly recovered or restarted.
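As an illustration, both the node preference list and the failback
policy are ordinary group properties and can be set with cluster.exe.
The group name and the failback window below are hypothetical, and
the property names should be confirmed with
cluster group "WebGroup" /prop before use; AutoFailbackType=1 allows
failback, and the window properties restrict failback to the hours
between 22:00 and 06:00:

cluster MYCLUSTER group "WebGroup" /setowners:NODE1,NODE2
cluster MYCLUSTER group "WebGroup" /prop AutoFailbackType=1
cluster MYCLUSTER group "WebGroup" /prop FailbackWindowStart=22 FailbackWindowEnd=6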
For failure detection, you want to know the
difference between the two mechanisms involved: heartbeat messages
exchanged between nodes, which detect the failure of an entire node,
and the LooksAlive/IsAlive polling performed by the Resource
Monitors through the resource DLLs, which detects the failure of an
individual resource.
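The polling side of this is tunable per resource through the
LooksAlivePollInterval and IsAlivePollInterval common properties
(values in milliseconds). A sketch using the hypothetical resource
from the earlier example; check the property names on your own build
with the first command before changing anything:

cluster MYCLUSTER resource "Web Name" /prop
cluster MYCLUSTER resource "Web Name" /prop LooksAlivePollInterval=5000 IsAlivePollInterval=60000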
Cluster Server Installation and Operation
The software requirements for installing Cluster service
include
-
Microsoft Windows 2000 Advanced Server or Windows 2000
Datacenter Server
-
DNS, WINS, or HOSTS file naming methods. DNS is preferable.
-
Terminal Services is optional. It allows remote cluster
administration.
For the Hardware, the node must meet the hardware requirements
for Windows 2000 Advanced Server or Windows 2000 Datacenter Server.
Also, the cluster hardware must be on the Cluster Service Hardware
Compatibility List.
Each of the two HCL-approved computers must have a boot disk with
Windows 2000 Advanced Server or Windows 2000 Datacenter Server
installed. The boot disk cannot be on the shared storage bus, though.
Then we need a separate PCI storage host adapter using SCSI or Fibre
Channel for the shared disks. Regarding the shared disks, we need an
HCL-approved external disk storage unit that connects to all
computers. RAID is not a must, but it is recommended. All shared
disks must be configured as basic disks, and all partitions on the
disks must be formatted as NTFS.
It is highly recommended that the hardware be completely identical
on all nodes, which makes configuration much easier.
Network Requirements
The network requirements include
-
A unique NetBIOS cluster name.
-
5 unique, static IP addresses: 2 for the network adapters on the
private network, 2 for the network adapters on the public network,
and 1 for the cluster itself (a sample addressing plan follows this
list).
-
A domain user account for Cluster service. Keep in mind that
all nodes must be members of the same domain.
-
Each node should have two network adapters, so that one can be
used for the connection to the public network and the other for the
node-to-node private cluster network.
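As an example only (every address shown here is hypothetical), the
five static addresses for a two-node cluster might be planned as
follows:

NODE1 private adapter   10.10.10.1     255.0.0.0       (no gateway, DNS, or WINS)
NODE2 private adapter   10.10.10.2     255.0.0.0       (no gateway, DNS, or WINS)
NODE1 public adapter    192.168.1.11   255.255.255.0
NODE2 public adapter    192.168.1.12   255.255.255.0
Cluster IP address      192.168.1.10   255.255.255.0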
In order to configure the Cluster service on a Windows 2000-based
server, your account must have administrative permissions on each
node. Also, all nodes must be member servers, or all nodes must be
domain controllers within the same domain; a mix of domain
controllers and member servers in a cluster is NOT supported.
During installation of Cluster service on the first node, all
other nodes must be offline, and all shared storage devices should
be powered up. Initial cluster configuration information is supplied
using the Cluster Service Configuration Wizard. Cluster service
files are located in the \i386 directory of the Windows 2000
Advanced Server or Windows 2000 Datacenter Server CD-ROM. You may
install from the CD or over the network.
After setting up the first computer, add the common data storage
devices that will be available to all members of the cluster. This
establishes the new cluster with a single node. Then you run the
installation utility on each additional computer that will be a
member in the cluster. As each new node is added, it automatically
receives a copy of the existing cluster database from the original
member of the cluster.
During setup, the quorum resource plays the role of tiebreaker
when a cluster is formed, or when network connections between nodes
fail. The quorum resource on the common cluster device stores the
most current version of the configuration database in the form of
recovery logs that contain node-independent cluster configuration
and state data. During cluster operations, the Cluster service uses
the quorum recovery logs to guarantee that only one set of active,
communicating nodes is allowed to form a cluster, to enable a node
to form a cluster only if it can gain control of the quorum
resource, and to allow a node to join or remain in an existing
cluster only if it can communicate with the node that controls the
quorum resource.
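For reference, the current quorum resource can be displayed, and if
necessary moved to a different shared disk, with cluster.exe. The
disk name, path, and log size below are hypothetical and the switch
syntax is given from memory, so confirm it with cluster /? before
use:

cluster MYCLUSTER /quorum
cluster MYCLUSTER /quorumresource:"Disk Q:" /path:Q:\MSCS /maxlogsize:4096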
When a cluster is formed, each node may be in one of three distinct
states (offline, online, or paused), recorded by the Event Processor
and replicated by the Event Log Manager to the other nodes in the
cluster.
To join an existing cluster, a server must have the Cluster
service running and must successfully locate another node in the
cluster via a discovery process. After locating another cluster
node, the joining server must be authenticated and receive a
replicated copy of the cluster configuration database.
Note that Cluster service of Windows 2000 supports rolling
operating system upgrades from Windows NT Server 4.0 Enterprise
Edition clusters deployed with Service Pack 4 or higher. This
provides users with a totally transparent upgrade.
A node can leave a cluster when it shuts down, when the Cluster
service is stopped, or when it fails. In a planned shutdown, the
node sends a ClusterExit message to all other members in the
cluster. Because the remaining nodes receive the exit message, they
do not need to perform the regroup process. When a node is evicted
rather than shut down in a planned way, its status is changed to
evicted.
Cluster Administrator is a graphical administrator's tool that
enables performing maintenance, monitoring, and failover
administration. Additionally, Cluster service includes an automation
interface for creating custom scripting tools for administering
cluster resources, nodes, and the cluster itself.
Cluster service runs in the context of a Windows-based domain
security policy, meaning that if the Cluster service does not have
access to a domain controller, it cannot form a cluster. Domain
controllers are replicated externally to the cluster, so the Cluster
service must depend upon the network to reach the replicas for
authentication. This makes the network a potential source of
failures. To work around this, you can make every node its own
authentication authority for the domain. One way is to create a new
domain that encompasses just the cluster itself and exists only to
provide authentication and authorization for the Cluster service and
any other installed services - we call it a domainlet. This
domainlet is small, lightweight, and contains no user accounts and
no global catalog servers.
The domainlet contains the well-known policies and groups defined
for every domain, including Administrators, Domain Administrators,
and the service accounts required by the clusters it supports, and
nothing else. Since every cluster node holds a replica of the
domainlet, a cluster will never generate authentication traffic.
It is very important that you enable logon without a global
catalog by defining the registry key as follows on each domain
controller:
HKLM\SYSTEM\CurrentControlSet\Control\Lsa\IgnoreGCFailures
You should remove the global catalog, if present, from the domain
controllers in the domainlet.
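One hedged way to create that key is to import a small .reg file on
each domain controller in the domainlet (the file below simply
creates IgnoreGCFailures as an empty subkey, which is how the
published workaround describes it; double-check the relevant
Microsoft Knowledge Base article before applying this to production
domain controllers, and note that a restart is typically required):

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\IgnoreGCFailures]

Save the text above as ignoregc.reg and import it silently with
regedit /s ignoregc.reg.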
Maintenance
Most maintenance operations within a cluster may be performed
with one or more nodes online without taking the entire cluster
offline.
Service packs may normally be installed on one node at a time and
tested before you move resources to the node, so that if something
goes wrong during the update to one node, the other node is still
untouched and continuing to make resources available. But in any
case, to avoid potential issues or compatibility problems with other
applications, go ahead and check the Microsoft Knowledge Base for
articles that may apply before proceeding.
Adapter replacement may be performed after moving resources and
groups to the other node. Make sure the new adapter configuration
for TCP/IP exactly matches that of the old adapter. If you are
replacing a SCSI adapter and using Y cables with external
termination, you may disconnect the SCSI adapter without affecting
the remaining cluster node. For Shared Disk Subsystem Replacement,
unfortunately, you will most likely have to shut down the cluster.
Note that the cluster configuration is not stored on the emergency
repair disk. The service and driver information for the Cluster
service is stored in the system registry. The configuration for
cluster resources and groups is stored in the cluster registry hive.
Back up the registry to preserve these important settings. For
example, you may use the following command to back up the cluster
hive
regback filename machine cluster
You want to make sure that the following are NOT done to the
cluster
-
create software fault tolerant sets with shared disks as
members.
-
add resources to the cluster group.
-
change computer names of either node.
-
use WINS static entries for cluster nodes or cluster
addresses.
-
configure WINS or default gateway addresses for the private
interconnect.
-
configure cluster resources to use unsupported network
protocols or related network services. IP is the only supported
protocol in a Cluster.
Application Deployment
Cluster-Aware Applications are applications with the following
characteristics
-
uses TCP/IP as a network protocol.
-
maintains data in a configurable location.
-
supports transaction processing.
The two types of cluster-aware applications are applications that
are managed as highly available cluster resources by a custom
resource type, or applications that interact with the cluster but
are not cluster resources. Note that Cluster Administrator itself is
an example of such an application.
On the contrary, a cluster-unaware application has no knowledge of
the cluster and does not call the Cluster API, although it can still
be managed as a cluster resource (for example, through the Generic
Application or Generic Service resource types).
Note that a cluster-unaware application can be made cluster-aware
by creating resource types to manage the application, as a custom
resource type can provide the initialization, cleanup, and
management routines specific to the needs of the application. If
everything works fine, you are NOT required to make the application
cluster-aware.
Application Installation
Before proceeding with application installation, you should first
determine the application's resource dependency relationships
between resources in the same resource group. Note that all
resources that you associate with an application must be in the
same group as the application, meaning that if multiple applications
or instances share a resource, all of them must be in the same
group. If you want to run the same application on both servers, you
need to define its resources on both disks. If it will use any
Cluster Group resources, you will need to add its resources to the
Cluster Group.
To install a typical application (see the command-line sketch after
these steps):
1. Create a resource group for the application.
2. Bring the group online on one server.
3. Install the application on the first server, and configure
the application to use the cluster storage.
4. Define the application services as cluster resources.
5. Move the group to the other server, install the
application there and configure the application to use the
cluster storage as well.
6. Confirm that the application will fail over. You may
manually simulate a server shutdown or server failure to test
this.
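A minimal command-line sketch of the same flow, assuming a
hypothetical service-based application installed as the Windows
service MyAppSvc and a group "AppGroup" that already contains the
shared disk resource "Disk E:" and a network name resource "AppName"
(as elsewhere, verify the switches with cluster /? first):

cluster MYCLUSTER group "AppGroup" /online
cluster MYCLUSTER resource "MyApp Service" /create /group:"AppGroup" /type:"Generic Service"
cluster MYCLUSTER resource "MyApp Service" /priv ServiceName=MyAppSvc
cluster MYCLUSTER resource "MyApp Service" /adddep:"Disk E:"
cluster MYCLUSTER resource "MyApp Service" /adddep:"AppName"
cluster MYCLUSTER resource "MyApp Service" /online
cluster MYCLUSTER group "AppGroup" /moveto:NODE2

The final /moveto line corresponds to steps 5 and 6: it pushes the
whole group to the second node so you can confirm that the service
starts there as well.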
ADDITIONAL READING LISTS
Security for Sharing Resources in a Cluster
Basically, the security considerations for sharing resources in a
cluster setup are similar to those in a general environment. The
standing recommendation is that rights should not be granted to a
local group for a directory hosted on the shared drive. Also keep in
mind that the Cluster service account requires at least NTFS Read
permission on the directory to properly create the share.
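A hedged sketch of publishing a directory on the shared drive as a
clustered file share is shown below. The group, share, and path
names are hypothetical, the ShareName and Path private property
names should be confirmed with cluster resource "Users Share" /priv,
and the NTFS permissions on E:\Users must already grant the cluster
service account at least Read access:

cluster MYCLUSTER resource "Users Share" /create /group:"FileGroup" /type:"File Share"
cluster MYCLUSTER resource "Users Share" /priv ShareName=Users Path=E:\Users
cluster MYCLUSTER resource "Users Share" /adddep:"Disk E:"
cluster MYCLUSTER resource "Users Share" /online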
Also note that in an active/active cluster configuration, the two
nodes can each own shared disks independently of each other.
Reparse Points
The technologies that make use of reparse points include
Directory Junctions, Volume Mount Points (also known as mounted
drive), Removable Storage Service RSS and Remote Installation
Services RIS. What is reparse point? With this you can surpass the
26 drive letter limitation and graft a target folder onto another
NTFS folder, much like mounting a volume onto an NTFS junction
point.
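Volume mount points themselves are created with the built-in
mountvol utility. A quick sketch, assuming an empty NTFS folder
E:\Logs on a clustered disk; the volume GUID is a placeholder that
you would copy from the listing produced by the first command:

mountvol
mountvol E:\Logs \\?\Volume{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}\
mountvol E:\Logs /D

Running mountvol with no arguments lists the available volume names
and current mount points, the second line grafts the chosen volume
onto the folder, and /D removes the mount point again.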
Print Spooling
It used to be very troublesome when setting up NT4 cluster to
host the print spooler. W2K has improvement towards this task. You
can use Cluster Server to create and host print server function.
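A hedged sketch of the Windows 2000 approach, assuming a group
"PrintGroup" that already contains a physical disk resource
"Disk P:", an IP address, and a network name resource "PrintName"
(printers and ports are then added against the virtual server name;
verify the switches with cluster /?):

cluster MYCLUSTER resource "Spooler" /create /group:"PrintGroup" /type:"Print Spooler"
cluster MYCLUSTER resource "Spooler" /adddep:"PrintName"
cluster MYCLUSTER resource "Spooler" /adddep:"Disk P:"
cluster MYCLUSTER resource "Spooler" /online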
Disk Replacement for a Cluster
When you want to replace cluster disks, there are usually two
variations: replacing a shared disk that is not the quorum disk, and
replacing the quorum disk itself.
WINS and DHCP
You may want to cluster WINS and DHCP to guarantee their
availability. Basically, you can install the server in any of the
following ways:
Install Windows 2000 without initially installing the Cluster
service, WINS, or DHCP. Add WINS or DHCP and the Cluster service in
any order later.
OR
Install Windows 2000 with the Cluster service first, then add
DHCP or WINS at a later time.
OR
Install Windows 2000 with WINS/DHCP or both first, and then
install the Cluster service at a later time.
Note that you should install the WINS or DHCP service on each
node in the cluster. You will also need to configure a cluster
resource afterwards.
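For example, once the DHCP service has been installed on both nodes,
the DHCP cluster resource might be configured roughly as follows.
The group, disk, and path names are hypothetical, the DatabasePath
and BackupPath private property names should be confirmed on your
own build, and both paths must point at the shared disk:

cluster MYCLUSTER resource "Cluster DHCP" /create /group:"DHCPGroup" /type:"DHCP Service"
cluster MYCLUSTER resource "Cluster DHCP" /adddep:"Disk F:"
cluster MYCLUSTER resource "Cluster DHCP" /adddep:"DHCP IP Address"
cluster MYCLUSTER resource "Cluster DHCP" /priv DatabasePath=F:\dhcp\ BackupPath=F:\dhcp\backup\
cluster MYCLUSTER resource "Cluster DHCP" /online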