Backup failed- cannot grab lock to create snapshot

Posted: Fri Jun 13, 2014 3:10 pm
by mannb
Hello,

I am currently running a 3-node VM cluster in VMware Workstation, and I want to back up my current database. When I run the vbr.py script with a configuration file I created, I receive quite a lengthy error. Part of it says that an ssh command failed with a non-zero exit status, but it ends with "Cannot grab lock to create snapshot":

Code:

27966: vbr client subproc on 192.168.40.131 terminates with returncode 1. Details in vbr_v_bdde_vm_node0002_client.log on that host. 
Error msg: Host key verification failed.
Traceback (most recent call last):
  File "/tmp/vbr/vbr.py", line 2702, in work
    remoteClient(args[0], args[1], args[2], args[3], args[4], args[5], args[6] == 'True')
  File "/tmp/vbr/vbr.py", line 890, in remoteClient
    ssList = subprocess.check_output(g["sshBackup"] + [sHost, cmd])
  File "/opt/vertica/oss/python/lib/python2.7/subprocess.py", line 537, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['ssh', '-x', '192.168.40.131', 'ls -1 /data/backups/v_bdde_vm_node0002']' returned non-zero exit status 255

Child processes terminated abnormally.
backup failed!
cleaning up...
27968: vbr client subproc on 192.168.40.132 terminates with returncode 1. Details in vbr_v_bdde_vm_node0003_client.log on that host. 
Error msg: Host key verification failed.
Traceback (most recent call last):
  File "/tmp/vbr/vbr.py", line 2702, in work
    remoteClient(args[0], args[1], args[2], args[3], args[4], args[5], args[6] == 'True')
  File "/tmp/vbr/vbr.py", line 890, in remoteClient
    ssList = subprocess.check_output(g["sshBackup"] + [sHost, cmd])
  File "/opt/vertica/oss/python/lib/python2.7/subprocess.py", line 537, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['ssh', '-x', '192.168.40.132', 'ls -1 /data/backups/v_bdde_vm_node0003']' returned non-zero exit status 255

27965: vbr client subproc on 192.168.40.130 terminates with returncode 255. Details in vbr_v_bdde_vm_node0001_client.log on that host. 
Error msg: Killed by signal 2.

Retrying... #1
ERROR 4153:  Node: v_bdde_vm_node0001: Cannot grab lock to create snapshot 'bdde_vm_fullbackup'. It might be used by others
When communicating with vertica, the process failed with code 1
backup failed!
Retrying... #2
ERROR 4153:  Node: v_bdde_vm_node0001: Cannot grab lock to create snapshot 'bdde_vm_fullbackup'. It might be used by others
When communicating with vertica, the process failed with code 1
backup failed!

I looked up this issue by searching for the error number, ERROR 4153, and found someone with a similar problem, but the only suggestion was to re-create the ssh keys under /root/.ssh. I hadn't actually created any, so I ran ssh-keygen and copied the key from each VM to every VM in the cluster by running ssh-copy-id -i ~/.ssh/id_rsa.pub <ip_address>, but the backup still failed.

I then deleted the authorized_keys file under /root/.ssh and re-ran ssh-copy-id -i ~/.ssh/id_rsa.pub <ip_address>, but I still receive the same error. I have also confirmed that passwordless SSH works from each VM for both the dbadmin and root users.
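
For anyone wanting to repeat the passwordless SSH check in a way that cannot fall back to an interactive prompt, here is a rough sketch. It assumes Python 3 and the OpenSSH client; the function names build_ssh_check and check_host are made up for illustration and are not part of vbr.py. BatchMode=yes makes ssh fail instead of asking for a password or host key confirmation, which is closer to how the ssh call in the traceback behaves. It would need to be run as the same user that runs vbr.py, from each node.

```python
import subprocess


def build_ssh_check(host, remote_cmd="true", timeout_s=5):
    """Return an ssh argument list that fails instead of prompting.

    Hypothetical helper: mirrors the 'ssh -x <host> <cmd>' form seen in
    the traceback, plus BatchMode so no password or host key prompt appears.
    """
    return [
        "ssh", "-x",
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout={}".format(timeout_s),
        host, remote_cmd,
    ]


def check_host(host, remote_cmd="true"):
    """Run the check and return (exit_code, stderr_text).

    ssh itself exits with 255 when the connection or host key
    verification fails; otherwise the remote command's status is returned.
    """
    proc = subprocess.run(
        build_ssh_check(host, remote_cmd),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr.strip()
```

Calling check_host("192.168.40.131", "ls -1 /data/backups/v_bdde_vm_node0002") would repeat the command from the traceback; an exit code of 255 together with "Host key verification failed." in the returned stderr would reproduce the failure outside of vbr.py.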

This is my configuration file:

Code:

[Misc]
snapshotName = bdde_vm_fullbackup
verticaConfig = True
restorePointLimit = 2

[Database]
dbName = bdde_vm
dbUser = dbadmin
dbPassword = bdde_vm
[Transmission]

[Mapping]
v_bdde_vm_node0001 = 192.168.40.130:/data/backups
v_bdde_vm_node0002 = 192.168.40.131:/data/backups
v_bdde_vm_node0003 = 192.168.40.132:/data/backups

I suspect that the backup is failing because these are virtual machines, but I'm not sure. Any help or suggestions to resolve this would be greatly appreciated.

Re: Backup failed- cannot grab lock to create snapshot

Posted: Tue Jul 08, 2014 6:58 pm
by rfamilypa
Hi there, I was wondering if you ever figured this out. We're running into the exact same error.

Thanks!

Re: Backup failed- cannot grab lock to create snapshot

Posted: Mon Jul 14, 2014 3:21 pm
by nonodename
The only time I have seen this error message on backups has been when my backup destination is full! Have you checked this?
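
If it helps, a quick way to check free space on a backup destination is sketched below. This assumes Python 3 and that it is run on each backup host itself; the path /data/backups comes from the [Mapping] section posted above, and the 10% warning threshold and the function names are arbitrary choices for illustration.

```python
import shutil


def free_percent(total_bytes, free_bytes):
    """Return free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def report(path, warn_below=10.0):
    """Return a one-line summary of free space for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    pct = free_percent(usage.total, usage.free)
    status = "LOW" if pct < warn_below else "ok"
    return "{}: {:.1f}% free ({})".format(path, pct, status)
```

Running print(report("/data/backups")) on each of the three hosts would show whether any destination is close to full.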