ASM Metadata (Disk headers, File Directory, Alias Directory)
Disk Group health
Process responsiveness
Here’s a structured feature implementation for handling "asm health checker found 1 new failures", suitable for a monitoring or alerting system.
# Locate the ASM alert log using ADRCI
adrci
adrci> set home +asm
adrci> show alert -tail 100
The ASM Health Checker is part of the Oracle Check Framework. It runs periodic checks on the ASM instance, disk groups, and metadata to ensure everything is operating within healthy parameters.
Note: Always back up your metadata and ensure you have a valid backup before running automated repair scripts on production storage.

5. Clearing the Alert
Introduction

The terse message “asm health checker found 1 new failures” appears straightforward but carries significant operational weight: it signals that an ASM (Automatic Storage Management, or a similarly named subsystem) health-check routine has detected a failure. Whether that ASM is Oracle ASM, a cloud Autoscaling/Service Mesh monitor, or a custom “Application Service Monitor,” the phrasing implies an automated health-scan discovered one additional fault relative to its prior baseline. This essay examines the message’s possible meanings, root causes, investigative approach, risk implications, and systematic remediation and prevention strategies. The aim is to move from alarm to actionable resolution, and from reactive fixes to durable system hardening.
Look out for disks showing a header_status of UNKNOWN, CANDIDATE, or FORMER when they should be MEMBER, or a mode_status marked as OFFLINE.
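The disk check described above could be automated along the following lines. This is a minimal sketch, not an Oracle-supplied tool: the function name `find_suspect_disks` and the dict-shaped rows are hypothetical, and it assumes the caller has already fetched `path`, `header_status`, and `mode_status` for disks that are expected to be disk group members (for example from V$ASM_DISK).

```python
# Header states that are suspicious for a disk that should be a MEMBER.
SUSPECT_HEADER_STATUSES = {"UNKNOWN", "CANDIDATE", "FORMER"}


def find_suspect_disks(rows):
    """Return (path, reason) pairs for disks that look unhealthy.

    `rows` is an iterable of dicts with the keys 'path', 'header_status'
    and 'mode_status' (a hypothetical shape chosen for this sketch).
    Only pass disks that are expected to be disk group members.
    """
    suspects = []
    for row in rows:
        header = (row.get("header_status") or "").upper()
        mode = (row.get("mode_status") or "").upper()
        if header in SUSPECT_HEADER_STATUSES:
            suspects.append((row.get("path"), f"header_status={header}"))
        elif mode == "OFFLINE":
            suspects.append((row.get("path"), "mode_status=OFFLINE"))
    return suspects
```

A monitoring job might raise an alert whenever the returned list is non-empty, and include the reasons in the alert text.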
In RAC, if the Cluster Synchronization Service (CSS) heartbeat is delayed, the ASM health checker may interpret this as an ASM instance failure.
# Check if device exists (if using ASMLIB)
ls -l /dev/oracleasm/disks/
# or, for device-mapper devices
ls -l /dev/mapper/ | grep asm
asmcmd chkdg DATA
When you encounter the "ASM Health Checker found 1 new failures" message, follow this structured approach to identify and resolve the underlying issue. The process involves confirming the failure, pinpointing its source, performing an initial repair, and verifying the result.
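The first step, confirming the failure, can be sketched as a scan of alert-log text for the message. This is an illustrative example only: the function name `count_new_failures` is hypothetical, and the pattern assumes the message appears in the log with the wording quoted in this article, matched case-insensitively.

```python
import re

# Matches the message quoted in this article; the count is captured.
FAILURE_PATTERN = re.compile(
    r"ASM Health Checker found (\d+) new failures?", re.IGNORECASE
)


def count_new_failures(alert_log_lines):
    """Sum the failure counts reported by matching alert-log lines."""
    total = 0
    for line in alert_log_lines:
        match = FAILURE_PATTERN.search(line)
        if match:
            total += int(match.group(1))
    return total
```

The lines could come from a saved copy of the ADRCI alert output; a result greater than zero means the message is present and the remaining steps apply.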
If you manage Oracle Grid Infrastructure (GI) or a standalone Automatic Storage Management (ASM) instance, one notification can send a chill down your spine: "ASM Health Checker found 1 new failures".
Log into your ASM instance via SQL*Plus as SYSASM to assess the cluster-wide operational health of your storage:
Drop unneeded files, purge the ASM volume recycle bin, or immediately add a new LUN to the disk group:
ALTER DISKGROUP <diskgroup_name> ADD DISK '/dev/mapper/new_lun1';

Scenario C: ASM Parameter Mismatch Across Nodes