Add filter to get most matched site or arbitrator #49
*me = node;
did_match = EXACT_MATCH;
break;
if((matched_tmp < matched * 8)||((node->type == SITE)&&(matched_tmp == matched * 8)))
Does it even compile? It appears matched_tmp is not declared ;)
And I don't actually get it here. If it's an exact match, why is it not the only one? Unless the node is running as both a site and an arbitrator?
The Travis build failed. Anyway, I'm not sure I understand the problem here. If a request for the ticket is received by a booth member that is not the ticket leader, that request is forwarded to the leader. @liu4480, did you observe this actually happening? Can you please post your configuration and the exact circumstances?
Hi Dejan, @krig encountered this issue that "booth revoke " fails:
ha2:~ # booth list
ticket: ticketA, leader: 10.12.2.201, expires: 2016-08-23 08:29:25
ticket: ticketB, leader: NONE
ha2:~ # booth revoke ticketA
Aug 23 08:19:51 ha2 booth: [5610]: ERROR: We're just an arbitrator, cannot grant/revoke tickets here.
But ha2 is not the arbitrator, it is a node in the cluster currently holding the ticket.
Configuration file:
# The booth configuration file is "/etc/booth/booth.conf". You need to
# prepare the same booth configuration file on each arbitrator and
# each node in the cluster sites where the booth daemon can be launched.
# Here is an example of the configuration file:
# "transport" means which transport layer booth daemon will use.
# Currently only "UDP" is supported.
transport="UDP"
# The port that booth daemons will use to talk to each other.
port="9929"
# The arbitrator IP. If you want to configure several arbitrators,
# you need to configure each arbitrator with a separate line.
arbitrator="10.12.2.105"
# The site IP. The cluster site uses this IP to talk to other sites.
# Like arbitrator, you need to configure each site with a separate line.
site="10.12.2.201"
site="10.12.2.202"
# The ticket name, which corresponds to a set of resources which can be
# fail-overed among different sites.
ticket="ticketA"
ticket="ticketB"
expire = 600
weights = 1,2,3
It sounds like it may pick a wrong one as the "local site" if the sites and the arbitrator are in the same subnet. There seems to be some problem in the logic around the "EXACT_MATCH".
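The suspected failure mode can be illustrated with a small model. The sketch below is not booth's `find_address()`; `struct node`, `same_subnet`, `first_match`, and the `IPV4` macro are hypothetical names, and the local address used in the note after the code is an assumption, since the thread does not give ha2's own address. It only shows how a "first entry in my subnet wins" rule can select the arbitrator when no exact match exists.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model; not taken from the booth sources. */
enum node_type { SITE, ARBITRATOR };

struct node {
    enum node_type type;
    uint32_t addr;              /* IPv4 address in host byte order */
};

#define IPV4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* True if a and b agree on their first prefixlen bits (0..32). */
static int same_subnet(uint32_t a, uint32_t b, int prefixlen)
{
    uint32_t mask = prefixlen == 0 ? 0 : ~(uint32_t)0 << (32 - prefixlen);
    return (a & mask) == (b & mask);
}

/*
 * Return the entry whose address equals the local address; if there is
 * none, fall back to the FIRST entry in the local subnet. That fallback
 * is what goes wrong when an arbitrator is listed before the sites.
 */
static const struct node *first_match(const struct node *nodes, size_t n,
                                      uint32_t local, int prefixlen)
{
    const struct node *fuzzy = NULL;

    for (size_t i = 0; i < n; i++) {
        if (nodes[i].addr == local)
            return &nodes[i];           /* exact match always wins */
        if (fuzzy == NULL && same_subnet(nodes[i].addr, local, prefixlen))
            fuzzy = &nodes[i];          /* remember the first subnet match */
    }
    return fuzzy;
}
```

With the entries in the order of the configuration above (arbitrator 10.12.2.105, then sites 10.12.2.201 and 10.12.2.202) and an assumed local address of 10.12.2.203/24, `first_match` returns the arbitrator entry, which would be consistent with the "We're just an arbitrator" error.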
I've also observed "same subnet" vs. fuzzy matching issues.
On Thu, Aug 25, 2016 at 06:20:44AM -0700, Jan Pokorný wrote:
@@ -108,10 +108,13 @@ static int find_address(unsigned char ipaddr[BOOTH_IPADDR_LEN],
break;
if (matched == node->addrlen) {
*address_bits_matched = matched * 8;
*me = node;
did_match = EXACT_MATCH;
break;
if((matched_tmp < matched * 8)||((node->type == SITE)&&(matched_tmp == matched * 8)))
I've also observed "same subnet" vs. fuzzy matching issues.
Before or after this commit:
Namely, some cloud testers complained about the issue, but after
this it worked for them (and for me).
It is before this commit.
Ah, this explains why I'm not able to reproduce it with the upstream code ;) Thanks for pointing that out, Dejan.
On Fri, Aug 26, 2016 at 02:56:33AM -0700, Gao,Yan wrote:
Ah, this explains why I'm not able to reproduce it with the upstream code ;) Thanks for pointing that out, Dejan.
OK, good :) I guess that we can close this issue then?
@dmuhamedagic, it was with that commit already included: #52
@gao-yan sorry for the ignorance.
By design, an arbitrator cannot revoke tickets. If the command line does not specify a site, booth tries to find a "local site" that is in the same subnet as the machine running the booth command. If the arbitrator and all sites are in the same subnet, booth chooses the first one as the "local site". Booth might then conclude that it is an arbitrator and refuse to execute revoke operations. This patch adds a filter to pick the best-matching site or arbitrator.
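The selection rule described here can be sketched as follows. This is a simplified model under stated assumptions, not the patch itself: `struct node`, `common_prefix_bits`, and `pick_local` are hypothetical names, addresses are plain IPv4 values, and the number of matched bits is capped at the interface prefix length. Only the update condition mirrors the one quoted from the diff (strictly more matched bits wins; on a tie, a site is preferred).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model; not taken from the booth sources. */
enum node_type { SITE, ARBITRATOR };

struct node {
    enum node_type type;
    uint32_t addr;              /* IPv4 address in host byte order */
};

#define IPV4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Number of leading bits shared by a and b (0..32). */
static int common_prefix_bits(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;
    int bits = 0;

    while (bits < 32 && !(diff & 0x80000000u)) {
        bits++;
        diff <<= 1;
    }
    return bits;
}

/*
 * Pick the entry that best matches the local address. An exact match
 * wins outright. Otherwise the entry with the most matched bits wins,
 * and on a tie a SITE replaces the current candidate.
 */
static const struct node *pick_local(const struct node *nodes, size_t n,
                                     uint32_t local, int prefixlen)
{
    const struct node *best = NULL;
    int best_bits = 0;

    for (size_t i = 0; i < n; i++) {
        int bits = common_prefix_bits(local, nodes[i].addr);

        if (bits == 32)
            return &nodes[i];           /* exact match */
        if (bits > prefixlen)
            bits = prefixlen;           /* only the subnet part counts */
        if (bits == 0)
            continue;                   /* nothing in common */
        if (best_bits < bits ||
            (nodes[i].type == SITE && best_bits == bits)) {
            best = &nodes[i];
            best_bits = bits;
        }
    }
    return best;
}
```

Because the tie condition is mirrored literally, a later site also replaces an earlier site with the same number of matched bits. Whether that is intended is not stated in the thread; changing the condition to apply only when the current candidate is an arbitrator would keep the first site instead.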
I have the VMs running booth in 192.168.122.0/24, with the following configuration:
Cluster node Arbitrator