LCM snapshot creation fails on health-check

It was time to update my Aria Automation certificates. Luckily, this is an automated process that you can run through LCM, and it is much easier than it used to be. As a precaution, though, I always take a snapshot of my current environment first.
However, when I tried to take the snapshot, I got an error during the precheck validations.

Unfortunately, I forgot to take a screenshot of the error, but the precheck reported issues with the storage.

You can search for the error in /var/log/vrlcm/vmware_vrlcm.log.
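
If you don't want to scroll through the whole log, something like the sketch below can narrow it down. The function name recent_errors and the search pattern are my own assumptions; the exact wording of the precheck error may differ in your log, so adjust the pattern as needed.

```shell
# Hypothetical helper: show the most recent lines that mention an error.
# $1 = log file (defaults to the LCM log path mentioned above)
# $2 = number of matching lines to show (defaults to 20)
recent_errors() {
  log_file="${1:-/var/log/vrlcm/vmware_vrlcm.log}"
  max_lines="${2:-20}"
  # -i: case-insensitive, -E: extended regex; the pattern is only a guess
  grep -iE 'error|exception' "$log_file" | tail -n "$max_lines"
}

# Example (run on the LCM appliance):
#   recent_errors
#   recent_errors /var/log/vrlcm/vmware_vrlcm.log 50
```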

But when I executed a disk check => vracli disk-mgr
all the disks had enough free space.

To be sure that the error you get is the actual error, there is another command you can execute.
When you execute => /opt/health/run.sh
you will see where the actual error is. In my case it was related to memory usage.

And indeed, when I checked the memory, one node had more than 90% active memory usage.
[Screenshot: k8s-metrics]
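
If you want a quick number per node, a minimal sketch like the one below works on any Linux appliance. It is not necessarily the same check I used: it assumes that "used" means MemTotal minus MemAvailable from /proc/meminfo, and the function name mem_used_percent is made up for this example.

```shell
# Hypothetical helper: print the used-memory percentage as a whole number.
# $1 = meminfo file (defaults to /proc/meminfo)
# Assumption: used = MemTotal - MemAvailable (both reported in kB).
mem_used_percent() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      if (total > 0) printf "%d\n", (total - avail) * 100 / total
    }
  ' "${1:-/proc/meminfo}"
}

# Example (run on each node):
#   mem_used_percent
```

A result above 90 on a node would match what I saw before the reboot.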

So I rebooted my environment, and after it came back online the memory usage had dropped to half of what it was before.

It took me a while to find this out, but hopefully it helps you get there sooner.

