UsMan's WoRkSpAce

Thursday, August 24, 2006

Monitoring SUN StorEdge 3000 storage arrays

SUN StorEdge 3000 storage arrays come bundled with two administration and monitoring tools: SUN StorEdge Configuration Service and SUN StorEdge Diagnostic Reporter. Both are included in the SUNWsscs package. Configuration Service has two components, an agent and a console. The agent should run on all managed servers, or at least on every server that is connected to a distinct array. The console is a Java app which connects to the agent and helps to monitor, manage and configure the arrays. Alerts are generated via email and SNMP. However, they only work while the console is running. If keeping the console running is not feasible, StorEdge Diagnostic Reporter must be enabled to work with Configuration Service.

The Configuration Service agent does report events to syslog (/var/adm/messages on Solaris) even if the console isn't running. Configuration Service can be started with the script /etc/init.d/ssagent, which starts two processes, /usr/sbin/ssmon and /usr/sbin/ssserver. Both must be running for the console GUI to connect to the agent. The agent generates events and sends them to the console, if it is running, which saves them to a file, /opt/SUNWsscs/sscsconsole/eventlog.txt. They can be viewed from the console GUI. The agent listens on TCP port 1270 for connections from the console and the Diagnostic Reporter daemon.

Configuration Service creates three OS accounts: ssmon (for monitoring), ssadmin (for administration) and ssconfig (for configuration). They can be managed using standard OS utilities. The console requires these accounts to manage the arrays.

Diagnostic Reporter has three components. The agent runs in the background and should run on the same machine as Configuration Service. The config tool (UI) is used to configure the types of alerts to be emailed. The mail receiver tool, a POP client program, displays the messages that are received. The agent listens on TCP port 7409 and accepts connections from the config tool. The agent process name is ssdgrptd and it is started by the /etc/init.d/ssdgrptd script. The config tool and the daemon require the ssconfig user password to connect to the Configuration Service agent. Diagnostic Reporter can also generate reports of the array configuration, provided Configuration Service is configured to do so. The Diagnostic Reporter service (ssdgrptd script) should be restarted whenever Configuration Service is restarted.
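Since both agents listen on fixed TCP ports (1270 for Configuration Service, 7409 for Diagnostic Reporter), a quick way to confirm they are up is to try connecting to those ports. Below is a minimal Python sketch of that idea. The function names and the choice of "localhost" are my own assumptions, not part of the SUN software, and a successful connect only shows that something is listening on the port, not that the agent is healthy.

```python
import socket

# Port numbers come from the notes above; the labels are just for display.
SERVICES = {
    "Configuration Service agent": 1270,
    "Diagnostic Reporter agent (ssdgrptd)": 7409,
}


def is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_services(host, services=SERVICES):
    """Map each service label to True/False depending on its port state."""
    return {name: is_listening(host, port) for name, port in services.items()}


if __name__ == "__main__":
    # "localhost" is an assumption; use the managed server's name instead.
    for name, up in check_services("localhost").items():
        print(f"{name}: {'listening' if up else 'NOT listening'}")
```

Run from cron, something like this could flag a stopped agent, which matters because email and SNMP alerts depend on these daemons being reachable.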

The SUNWsscs package also includes the StorEdge CLI. It provides the command /opt/SUNWsscs/sbin/sccli to monitor, manage and configure an array from the command line.

Wednesday, August 23, 2006

Veritas Cluster Software

VCS resources are of three kinds: on-off, on-only and persistent. On failover, resources are not switched individually; the entire service group is switched. The Critical attribute of a resource defines whether the service group fails over when that resource faults. When a resource faults, VCS takes action to clean up the resource.

VCS service groups are of three kinds: failover, parallel and hybrid. The ClusterService group is a special service group for resources required by VCS itself. Freezing a service group prevents VCS from taking action when the service group or a system faults.
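The failover rules from the last two paragraphs can be put into a toy model: a group fails over as a whole when a critical resource faults, unless the group is frozen. This is only a sketch in Python to make the rule concrete. The class and method names are made up, and it deliberately ignores resource dependencies, which real VCS also takes into account.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    name: str
    critical: bool = True   # stands in for the Critical attribute
    faulted: bool = False


@dataclass
class ServiceGroup:
    name: str
    resources: List[Resource] = field(default_factory=list)
    frozen: bool = False    # a frozen group is left alone on faults

    def should_fail_over(self):
        """Simplified rule: fail over the whole group if any critical
        resource has faulted and the group is not frozen."""
        if self.frozen:
            return False
        return any(r.faulted and r.critical for r in self.resources)


if __name__ == "__main__":
    # Hypothetical group with one critical and one non-critical resource.
    group = ServiceGroup("websg", [
        Resource("webip", critical=True),
        Resource("weblog", critical=False),
    ])
    group.resources[1].faulted = True
    print(group.should_fail_over())   # non-critical fault only
    group.resources[0].faulted = True
    print(group.should_fail_over())   # critical fault
```

Note that the decision is made per group, not per resource, which matches the point that resources are never switched individually.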

Agents are VCS processes that manage resources of pre-defined types according to commands received from the VCS engine (HAD). Agents can be written in C++, Perl or shell script. Agents are classified as bundled, enterprise or custom. hashadow monitors HAD and restarts it if required. HAD runs at a high priority: priority 0 in the real-time class. An agent consists of entry points and the agent framework. HAD core dumps are written to the /var/VRTSvcs/diag/had directory.

Cluster communications use LLT and GAB. LLT is a high-performance, low-latency replacement for the IP stack. GAB runs on top of LLT and manages cluster membership and communications.

By default, VCS monitors each resource every 60 seconds. The default number of threads per agent of a resource type is 10. Resource type attributes that tune monitoring behaviour include MonitorInterval, OfflineMonitorInterval, MonitorTimeout, OnlineTimeout, ToleranceLimit and OfflineTimeout.
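To see how MonitorInterval and ToleranceLimit might interact, here is a small Python sketch. It assumes that ToleranceLimit is the number of offline monitor results that are tolerated before an online resource is declared faulted, and that the count resets when a monitor cycle reports online again. Both points are my reading of the attribute, so check the VCS documentation before relying on them. The function names are invented for the example.

```python
def cycle_when_faulted(monitor_results, tolerance_limit=0):
    """Return the 1-based monitor cycle at which the resource would be
    declared faulted, or None if it never is.

    monitor_results is an iterable of booleans, True meaning the monitor
    entry point reported the resource online in that cycle.
    """
    offline_in_a_row = 0
    for cycle, online in enumerate(monitor_results, start=1):
        if online:
            offline_in_a_row = 0       # assumption: counter resets
            continue
        offline_in_a_row += 1
        if offline_in_a_row > tolerance_limit:
            return cycle
    return None


def elapsed_seconds(cycle, monitor_interval=60):
    """Rough time from the first monitor cycle, using the 60 s default."""
    return cycle * monitor_interval


if __name__ == "__main__":
    results = [True, False, False, False]
    for limit in (0, 1, 2):
        cycle = cycle_when_faulted(results, tolerance_limit=limit)
        print(limit, cycle, elapsed_seconds(cycle))
```

Under these assumptions, each extra unit of ToleranceLimit delays fault detection by roughly one more MonitorInterval.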

Cluster states at boot time are LOCAL_BUILD, RUNNING,