
Re: Vertica Nodes Randomly Fail

Posted: Wed Jul 17, 2013 3:23 am
by becky
Hi,

Unfortunately my node 3 failed again.

Here is my /opt/vertica/config/vspread.conf file:

Spread_Segment XXX.XXX.XXX.255:4803 {
  NXXXXXXXXX131    XXX.XXX.XXX.131 {
    XXX.XXX.XXX.131
    127.0.0.1
  }
  NXXXXXXXXX179    XXX.XXX.XXX.179 {
    XXX.XXX.XXX.179
    127.0.0.1
  }
  NXXXXXXXXX180    XXX.XXX.XXX.180 {
    XXX.XXX.XXX.180
    127.0.0.1
  }
}
EventLogFile = /dev/null
EventTimeStamp = "[%a %d %b %Y %H:%M:%S]"
DaemonUser = spread
DaemonGroup = verticadba

Does that look right?

Re: Vertica Nodes Randomly Fail

Posted: Wed Jul 17, 2013 3:59 am
by becky
Hi Scutter,

Sorry, I messed up on the first attempt at reconfiguring the spread.

I re-ran install_vertica like this:
  • /opt/vertica/sbin/install_vertica -T -S default -s vertica01,vertica02,vertica03 -r vertica-6.1.2-0.x86_64.RHEL5.rpm
However, after running the above command, the /opt/vertica/config/vspread.conf file did not change.

I'll let you know if the nodes stay up...

Re: Vertica Nodes Randomly Fail

Posted: Wed Jul 17, 2013 12:20 pm
by becky
Hi,

Bad news... I woke up this morning and all three nodes were down :( It seems one node failed and then later another failed, bringing down the whole database.

This is baffling!

Re: Vertica Nodes Randomly Fail

Posted: Wed Jul 17, 2013 2:31 pm
by scutter
Try renaming the vspread.conf on all of the nodes, then re-run install_vertica with -T -S default. Afterwards, check whether you get multiple Spread_Segments.

Re: Vertica Nodes Randomly Fail

Posted: Wed Jul 17, 2013 4:02 pm
by becky
Hi Scutter,

I renamed the /opt/vertica/config/vspread.conf to vspread.conf_old. Then I reinstalled Vertica. After the install, I checked the new vspread.conf file and it looks exactly the same as the old one.

Bummer...

Re: Vertica Nodes Randomly Fail

Posted: Wed Jul 17, 2013 4:21 pm
by scutter
When I have a chance, I'll try the reconfiguration on a couple of my VMs to see what I get.

Re: Vertica Nodes Randomly Fail

Posted: Thu Jul 18, 2013 1:24 am
by scutter
Hi Becky,

I created a 2-node cluster on 6.1.2 by running install_vertica with -T -S default, and I get the expected multiple Spread_Segments:


Spread_Segment 192.168.1.10:4803 {
  N192168001010    192.168.1.10 {
    192.168.1.10
    127.0.0.1
  }
}
Spread_Segment 192.168.1.12:4803 {
  N192168001012    192.168.1.12 {
    192.168.1.12
    127.0.0.1
  }
}

Did you rename the vspread.conf on all three nodes before reinstalling? I suggest continuing to look for the reason why spread is not being reconfigured properly, so that you can determine whether -T helps your situation.
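
To compare the nodes quickly, a small script can count the Spread_Segment blocks in each vspread.conf. The following is only a rough sketch: the function name `segment_addresses` and the regular expression are made up for this illustration and are not part of any Vertica tooling, and the addresses in `SINGLE_SEGMENT` are placeholders. A file with one segment holding all nodes matches the layout Becky posted, while one segment per node matches the layout shown above.

```python
import re

# Matches the opening line of a top-level block, e.g.
#   Spread_Segment 192.168.1.10:4803 {
# and captures the address:port token. (Assumed pattern, based only on
# the file excerpts posted in this thread.)
SEGMENT_RE = re.compile(r"^\s*Spread_Segment\s+(\S+)\s*\{", re.MULTILINE)


def segment_addresses(conf_text):
    """Return the address:port of every Spread_Segment block in the text."""
    return SEGMENT_RE.findall(conf_text)


# One segment containing every node (placeholder addresses).
SINGLE_SEGMENT = """\
Spread_Segment 10.0.0.255:4803 {
  N010000000001    10.0.0.1 {
    10.0.0.1
    127.0.0.1
  }
  N010000000002    10.0.0.2 {
    10.0.0.2
    127.0.0.1
  }
}
"""

# One segment per node, as in the 2-node test cluster output.
PER_NODE_SEGMENTS = """\
Spread_Segment 192.168.1.10:4803 {
  N192168001010    192.168.1.10 {
    192.168.1.10
    127.0.0.1
  }
}
Spread_Segment 192.168.1.12:4803 {
  N192168001012    192.168.1.12 {
    192.168.1.12
    127.0.0.1
  }
}
"""

if __name__ == "__main__":
    for label, text in [("single", SINGLE_SEGMENT), ("per-node", PER_NODE_SEGMENTS)]:
        found = segment_addresses(text)
        print(label, len(found), found)
```

To check a real node, read the file instead of the sample strings, for example `segment_addresses(open("/opt/vertica/config/vspread.conf").read())`, and run it on each node in turn.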

--Sharon