Backup to standby cluster

Moderator: NorbertKrupa

binface
Newbie
Posts: 13
Joined: Fri Jun 15, 2012 2:40 pm

Backup to standby cluster

Post by binface » Wed Sep 12, 2012 4:31 pm

Hi,

I have successfully performed a backup locally on my live cluster by creating a snapshot and subsequent incrementals and this works fine.

I now want to back up the same database to a standby cluster in my secondary data centre, so I have created this ini file:

--------------------------------------------------
[Misc]
snapshotName = backup_snapshot
verticaConfig = False
restorePointLimit = 1

[Database]
dbName = DataStore
dbUser = vertica
dbPassword = xxxxxxx

[Transmission]

[Mapping0]
dbNode = v_datastore_node0001
backupHost = livevertica005-priv
backupDir = /vert_data

[Mapping1]
dbNode = v_datastore_node0002
backupHost = livevertica006-priv
backupDir = /vert_data

[Mapping2]
dbNode = v_datastore_node0003
backupHost = livevertica007-priv
backupDir = /vert_data

[Mapping3]
dbNode = v_datastore_node0004
backupHost = livevertica008-priv
backupDir = /vert_data
--------------------------------------------------
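For readers following along: judging by the traceback further down, vbr.py connects over ssh to the backupHost of each [MappingN] section and works in backupDir/dbNode. Below is a minimal Python sketch for listing those targets, assuming the file parses as ordinary ini. The names backup_targets and SAMPLE_INI are illustrative and not part of vbr.py, and the sample is a trimmed copy of the config above.

```python
import configparser

# Trimmed copy of the config above (two mappings only), for illustration.
SAMPLE_INI = """
[Misc]
snapshotName = backup_snapshot

[Mapping0]
dbNode = v_datastore_node0001
backupHost = livevertica005-priv
backupDir = /vert_data

[Mapping1]
dbNode = v_datastore_node0002
backupHost = livevertica006-priv
backupDir = /vert_data
"""


def backup_targets(ini_text):
    """Return (dbNode, backupHost, backupDir) for every [MappingN] section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    targets = []
    for section in parser.sections():
        if section.startswith("Mapping"):
            entry = parser[section]
            targets.append((entry["dbNode"], entry["backupHost"], entry["backupDir"]))
    return targets


if __name__ == "__main__":
    for node, host, directory in backup_targets(SAMPLE_INI):
        print(f"{node} -> {host}:{directory}")
```

Every host in that list has to be reachable over ssh without any prompt from each database node.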
However, I get the following error when running the backup:

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[LIVE.DC1][vertica@livevertica001 /opt/vertica/config]$ /opt/vertica/bin/vbr.py -t backup --config-file /opt/vertica/config/backup_snapshot.ini
Preparing...
Found Database port: 5433
Copying...
29117: vbr client subproc on 172.24.2.153 terminates with returncode 1. Details in vbr_v_datastore_node0004_client.log on that host.
Error msg: Host key verification failed.
Traceback (most recent call last):
  File "/tmp/vbr/vbr.py", line 2215, in work
    remoteClient(args[0], args[1], args[2], args[3], args[4], args[5], args[6] == 'True')
  File "/tmp/vbr/vbr.py", line 844, in remoteClient
    ssList = subprocess.check_output(['ssh', '-x', sHost, cmd])
  File "/opt/vertica/oss/python/lib/python2.7/subprocess.py", line 537, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['ssh', '-x', '172.24.6.153', 'ls -1 /vert_data/v_datastore_node0004']' returned non-zero exit status 255

29114: vbr client subproc on 172.24.2.151 terminates with returncode 1. Details in vbr_v_datastore_node0002_client.log on that host.
Error msg: Host key verification failed.
Traceback (most recent call last):
  File "/tmp/vbr/vbr.py", line 2215, in work
    remoteClient(args[0], args[1], args[2], args[3], args[4], args[5], args[6] == 'True')
  File "/tmp/vbr/vbr.py", line 844, in remoteClient
    ssList = subprocess.check_output(['ssh', '-x', sHost, cmd])
  File "/opt/vertica/oss/python/lib/python2.7/subprocess.py", line 537, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['ssh', '-x', '172.24.6.151', 'ls -1 /vert_data/v_datastore_node0002']' returned non-zero exit status 255

Child processes terminated abnormally.
backup failed!
cleaning up...
29115: vbr client subproc on 172.24.2.152 terminates with returncode 255. Details in vbr_v_datastore_node0003_client.log on that host.
Error msg: Killed by signal 2.

Retrying... #1
ERROR 4153: Node: v_datastore_node0001: Cannot grab lock to create snapshot 'backup_snapshot'. It might be used by others
When communicating with vertica, the process failed with code 1
backup failed!
Retrying... #2
ERROR 4153: Node: v_datastore_node0001: Cannot grab lock to create snapshot 'backup_snapshot'. It might be used by others
When communicating with vertica, the process failed with code 1
backup failed!
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

I have set up passwordless ssh between all nodes and it works, but the backup still fails.

Any ideas?

Cheers

id10t
GURU
Posts: 732
Joined: Mon Apr 16, 2012 2:44 pm

Re: Backup to standby cluster

Post by id10t » Wed Sep 12, 2012 5:58 pm

Hi!

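Exit status 255 is what ssh returns when ssh itself fails (a connection, authentication or host key problem), and your log gives the reason: "Host key verification failed."
In your case vbr.py connects by IP address rather than by name: 172.24.6.151, 172.24.6.152, 172.24.6.153

1. Check that ssh is passwordless for the IP addresses, and not only for the names livevertica00{5,6,7,8}-priv.

2. If it is passwordless for the IPs as well, please post the output of: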

cat /opt/vertica/config/admintools.conf
and, from all hosts:

cat /etc/hosts
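
The check in step 1 can be scripted. Here is a minimal Python sketch, assuming OpenSSH on the nodes; the helper names ssh_check_command and check_hosts are illustrative and not part of vbr.py. With BatchMode=yes, ssh fails instead of prompting, so an unaccepted host key or a missing authorized key shows up as a non-zero return code (255 for errors in ssh itself) rather than as a question on the terminal.

```python
import subprocess


def ssh_check_command(host):
    """Build an ssh command line that fails instead of prompting."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # no password or host key prompts
        "-o", "ConnectTimeout=5",   # give up quickly on unreachable hosts
        host,
        "true",                     # harmless remote command
    ]


def check_hosts(hosts, run=subprocess.run):
    """Return {host: (returncode, stderr)} for a non-interactive ssh to each host.

    `run` is injectable so the function can be exercised without a network.
    """
    results = {}
    for host in hosts:
        proc = run(ssh_check_command(host), capture_output=True, text=True)
        results[host] = (proc.returncode, proc.stderr.strip())
    return results
```

Run as the vertica user on each database node, for example check_hosts(["172.24.6.151", "172.24.6.152", "172.24.6.153"]) with the addresses from the log above; any host that does not come back with return code 0 is one that vbr.py would also fail on.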

chad
Newbie
Posts: 10
Joined: Wed Oct 16, 2013 2:00 am

Re: Backup to standby cluster

Post by chad » Mon Feb 24, 2014 11:25 am

After spending a few hours on a similar ERROR 255, I cleared .ssh/authorized_keys on each node and reran ssh-copy-id -i ~/.ssh/id_rsa.pub <node IP> for each node, INCLUDING the node's own IP address. I am not sure whether that last part was necessary, but I did it anyway. I then ran rm -fR v_reporting_node0001/ on each node to clear the backup folders and reran vbr.py with no issues.

Just my $.02 on how I got this to work, I hope it helps someone...
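
The key redistribution described above can be sketched in Python as a generator of the command lines to run on each node. This only builds the commands and does not execute anything; the function name key_distribution_commands and the 192.0.2.x addresses are placeholders, not values from this thread. Connecting to each IP once through ssh-copy-id presumably also records that IP's host key in known_hosts, which is what the "Host key verification failed" message is about.

```python
def key_distribution_commands(node_ips, public_key="~/.ssh/id_rsa.pub"):
    """Return the ssh-copy-id command lines to run on ONE node.

    node_ips should list every node in the cluster, including the node
    the commands are run on, mirroring the steps described above.
    """
    return [f"ssh-copy-id -i {public_key} {ip}" for ip in node_ips]


if __name__ == "__main__":
    # Placeholder addresses (documentation range); replace with real node IPs.
    for line in key_distribution_commands(["192.0.2.11", "192.0.2.12", "192.0.2.13"]):
        print(line)
```

The printed lines are meant to be reviewed and then run by hand on every node, since ssh-copy-id will ask for a password and for host key confirmation.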
