Add filter to get most matched site or arbitrator #49

liu4480 · 2016-08-25T03:31:38Z

By design, an arbitrator cannot revoke tickets. If the command line
does not specify a site, booth tries to find a "local site", i.e. one
that is in the same subnet as the machine running the booth command.

If the arbitrator and all sites are in the same subnet, booth chooses
the first one as the "local site". Booth might then think it is an
arbitrator and refuse to execute revoke operations.

This patch adds a filter to pick the best-matching site or arbitrator.
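
The tie-break described here can be sketched in isolation. The following is a minimal, hypothetical illustration and not the code in src/transport.c: `struct node`, `matching_bytes` and `pick_local` are names invented for this example, addresses are IPv4 only, and matching is counted in whole bytes on the assumption that this is what `matched * 8` in the patch expresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for booth's node types; not the real structs. */
enum node_type { ARBITRATOR, SITE };

struct node {
    enum node_type type;
    uint8_t addr[4];            /* IPv4 only, to keep the sketch short */
};

/* Number of leading bytes (0..4) on which a and b agree. */
static int matching_bytes(const uint8_t a[4], const uint8_t b[4])
{
    int n = 0;
    while (n < 4 && a[n] == b[n])
        n++;
    return n;
}

/*
 * Return the node sharing the most leading bytes with `local`.
 * On a tie, a SITE replaces an ARBITRATOR, so a site wins whenever
 * it matches at least as well. Nodes matching fewer than `min_bytes`
 * are ignored; NULL means no candidate was found.
 */
static const struct node *pick_local(const struct node *nodes, size_t n,
                                     const uint8_t local[4], int min_bytes)
{
    const struct node *best = NULL;
    int best_bytes = -1;

    for (size_t i = 0; i < n; i++) {
        int bytes = matching_bytes(local, nodes[i].addr);

        if (bytes < min_bytes)
            continue;
        if (bytes > best_bytes ||
            (bytes == best_bytes &&
             nodes[i].type == SITE && best->type != SITE)) {
            best = &nodes[i];
            best_bytes = bytes;
        }
    }
    return best;
}
```

With an arbitrator and two sites that all share three leading bytes with the local address, `pick_local` returns the first site rather than the arbitrator; an exact four-byte match on the arbitrator's own address still selects the arbitrator.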

gao-yan · 2016-08-25T11:53:46Z

src/transport.c

-			*me = node;
-			did_match = EXACT_MATCH;
-			break;
+			if((matched_tmp < matched * 8)||((node->type == SITE)&&(matched_tmp == matched * 8)))


Does it even compile? It appears matched_tmp is not declared ;)

And I don't actually get it here. If it's the exact match, why is it not the only one? Unless the node is running as both a site and an arbitrator?

The Travis build failed. Anyway, I'm not sure I understand the problem here. If a request for a ticket is received by a booth member that is not the ticket leader, that request is forwarded to the leader. @liu4480, did you observe this actually happening? Can you please post your configuration and the exact circumstances?

Hi Dejan, @krig encountered this issue that "booth revoke " fails:

ha2:~ # booth list
ticket: ticketA, leader: 10.12.2.201, expires: 2016-08-23 08:29:25
ticket: ticketB, leader: NONE
ha2:~ # booth revoke ticketA
Aug 23 08:19:51 ha2 booth: [5610]: ERROR: We're just an arbitrator, cannot grant/revoke tickets here.

But ha2 is not the arbitrator; it is a node in the cluster that currently holds the ticket.

Configuration file:

# The booth configuration file is "/etc/booth/booth.conf". You need to
# prepare the same booth configuration file on each arbitrator and
# each node in the cluster sites where the booth daemon can be launched.
# Here is an example of the configuration file:

# "transport" means which transport layer booth daemon will use.
# Currently only "UDP" is supported.
transport="UDP"

# The port that booth daemons will use to talk to each other.
port="9929"

# The arbitrator IP. If you want to configure several arbitrators,
# you need to configure each arbitrator with a separate line.
arbitrator="10.12.2.105"

# The site IP. The cluster site uses this IP to talk to other sites.
# Like arbitrator, you need to configure each site with a separate line.
site="10.12.2.201"
site="10.12.2.202"

# The ticket name, which corresponds to a set of resources which can be
# fail-overed among different sites.
ticket="ticketA"
ticket="ticketB"
    expire = 600
    weights = 1,2,3

It sounds like it may pick a wrong one as the "local site" if the sites and the arbitrator are in the same subnet. There seems to be some problem in the logic around the "EXACT_MATCH".
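
The suspected failure mode can be worked through with the addresses from this thread. The sketch below is hypothetical and is not booth's code: it assumes that nodes are scanned in configuration-file order and that the first node found in the local subnet is accepted. `struct node`, `ip4`, `same_subnet` and `first_in_subnet` are names invented for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not booth's definitions. */
enum node_type { ARBITRATOR, SITE };

struct node {
    enum node_type type;
    uint32_t addr;              /* IPv4 address in host byte order */
};

/* Pack a dotted quad into a host-order 32-bit value. */
static uint32_t ip4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* True if x and y are in the same subnet for a prefix length of 0..32. */
static int same_subnet(uint32_t x, uint32_t y, int prefixlen)
{
    /* Shifting a 32-bit value by 32 is undefined, so handle /0 separately. */
    uint32_t mask = prefixlen == 0 ? 0 : ~(uint32_t)0 << (32 - prefixlen);

    return (x & mask) == (y & mask);
}

/* Naive selection: the first configured node in the local subnet wins. */
static const struct node *first_in_subnet(const struct node *nodes, size_t n,
                                          uint32_t local, int prefixlen)
{
    for (size_t i = 0; i < n; i++)
        if (same_subnet(local, nodes[i].addr, prefixlen))
            return &nodes[i];
    return NULL;
}
```

For ha2 (10.12.2.102/24) and the configuration above, the arbitrator 10.12.2.105 is listed first and lies in the same /24 as both sites, so a scan like this returns the arbitrator. That would be consistent with the "We're just an arbitrator" error, although the thread does not confirm that this is the exact code path.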

I've also observed "same subnet" vs. fuzzy matching issues.

On Thu, Aug 25, 2016 at 06:20:44AM -0700, Jan Pokorný wrote:

@@ -108,10 +108,13 @@ static int find_address(unsigned char ipaddr[BOOTH_IPADDR_LEN],
				break;
		if (matched == node->addrlen) {
			*address_bits_matched = matched * 8;
			*me = node;
			did_match = EXACT_MATCH;
			break;
			if((matched_tmp < matched * 8)||((node->type == SITE)&&(matched_tmp == matched * 8)))

I've also observed "same subnet" vs. fuzzy matching issues.

Before or after this commit:

b2e06e8

Namely, some cloud testers complained about the issue, but after
this it worked for them (and for me).

It is before this commit.

Ah, this explains why I'm not able to produce it with the upstream code;) Thanks for pointing out, Dejan.

On Fri, Aug 26, 2016 at 02:56:33AM -0700, Gao,Yan wrote:

Ah, this explains why I'm not able to produce it with the upstream code;) Thanks for pointing out, Dejan.

OK, good :) I guess that we can close this issue then?

@dmuhamedagic, the issue was observed with that commit already included: #52

liu4480 · 2016-08-26T01:42:48Z

@gao-yan sorry for the ignorance.
I mean, if you have a site and an arbitrator in the same subnet, there are two issues so far: 1) if you execute the command on a third node in the site, it will choose the arbitrator as local; 2) if you execute the command on the booth site node, it still treats the arbitrator as local.

liu4480 · 2016-08-26T04:53:05Z

I have the VMs running booth in 192.168.122.0/24, with the following configuration:
port = 9929
transport = UDP
authfile = /etc/booth/booth.key
arbitrator = 192.168.122.83
site = 192.168.122.143
site = 192.168.122.62
ticket = dummy
retries = 5
expire = 900
timeout = 10

krig · 2016-08-26T05:33:13Z

Cluster node ha2:

vagrant@ha2:~> ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:82:dd:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe82:dd06/64 scope link 
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:0d:96:01 brd ff:ff:ff:ff:ff:ff
    inet 10.12.2.102/24 brd 10.12.2.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe0d:9601/64 scope link 
       valid_lft forever preferred_lft forever

Arbitrator ha5:

vagrant@ha5:~> ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:82:dd:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe82:dd06/64 scope link 
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:fe:60:fb brd ff:ff:ff:ff:ff:ff
    inet 10.12.2.105/24 brd 10.12.2.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fefe:60fb/64 scope link 
       valid_lft forever preferred_lft forever

gao-yan reviewed Aug 25, 2016

liu4480 closed this Aug 29, 2016
