
Conversation

@slaskawi (Contributor) commented Sep 6, 2017

https://issues.jboss.org/browse/CLOUD-2001

This Pull Request represents the same functionality as implemented here: jgroups-extras/jgroups-kubernetes#36

The idea is to split clusters during a Rolling Update (see the JIRA for more information) by setting the split_clusters_during_rolling_update field in KUBE_PING. The implementation is based on the metadata part of the Pod spec: it filters out all pods with a different deployment label. Since rolling out a new deployment updates this label (the -n suffix), we ping only nodes from the same deployment.
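
A minimal sketch of the filtering idea (not the actual patch; the Pod type and the deployment label key below are assumptions made for illustration):

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch only: keep the pods whose deployment label matches the local pod's label,
// so members of the old and the new deployment never see each other during a Rolling Update.
public class DeploymentFilter {

    // Hypothetical pod representation carrying the labels from the Pod metadata.
    public static final class Pod {
        final String name;
        final Map<String, String> labels;

        Pod(String name, Map<String, String> labels) {
            this.name = name;
            this.labels = labels;
        }
    }

    // The label key is an assumption; OpenShift stamps the pods of each rollout
    // with a per-deployment label value such as "<dc-name>-<n>".
    static final String DEPLOYMENT_LABEL = "deployment";

    public static List<Pod> filterSameDeployment(List<Pod> candidates, Pod self) {
        String myDeployment = self.labels.get(DEPLOYMENT_LABEL);
        if (myDeployment == null) {
            // Label missing: fall back to the old behaviour and ping everyone.
            return candidates;
        }
        return candidates.stream()
                .filter(p -> myDeployment.equals(p.labels.get(DEPLOYMENT_LABEL)))
                .collect(Collectors.toList());
    }
}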

@rcernich (Contributor) commented Sep 6, 2017

Can the same functionality be implemented with DNS? If not, this feature would be specific to kube queries, which is OK. It would be nice if both dns and kube offered the same capabilities. Next question would be if this would work with multicast.

@slaskawi (Contributor, Author) commented Sep 7, 2017

Hey @rcernich Sure, I can implement this add-on to DNS Ping as well.

> Next question would be if this would work with multicast.

Yes, there should be no problem with that. However, with a multicast-capable network plugin you can use plain MPING, which is probably much faster (and easier to use, since it requires zero configuration) than any version of the KUBE/DNS Pings.
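
For comparison, a minimal MPING-based stack written in the same clustered-openshift.xml style could look like the sketch below (the jgroups-mping socket binding name follows the usual WildFly defaults and is an assumption here):

...
<stack name="tcp">
<transport type="TCP" socket-binding="jgroups-tcp"/>
<!-- Multicast-based discovery; no KUBE_PING/DNS_PING properties are needed. -->
<protocol type="MPING" socket-binding="jgroups-mping"/>
</stack>
...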

@slaskawi (Contributor, Author) commented Sep 7, 2017

> Hey @rcernich Sure, I can implement this add-on to DNS Ping as well.

Sorry, I was mistaken. This needs to be KUBE_PING-only functionality. With the DNS approach we have no access to Pod labels (that would require calling the Kubernetes API, which is exactly what we want to avoid when using DNS, right?).
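
Just to illustrate what that extra call would involve, here is a rough sketch of reading a Pod's metadata (including its labels) from the Kubernetes API using the mounted service-account token; TLS trust for the cluster CA is omitted for brevity, so treat it as illustrative only:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Sketch of the Kubernetes API call a DNS-based discovery would still need
// in order to see Pod labels (metadata.labels) such as the deployment label.
public class PodMetadataLookup {

    public static String fetchPodJson(String namespace, String podName) throws Exception {
        // Service-account token mounted into every pod by default.
        String token = new String(Files.readAllBytes(
                Paths.get("/var/run/secrets/kubernetes.io/serviceaccount/token")),
                StandardCharsets.UTF_8).trim();

        URL url = new URL("https://kubernetes.default.svc/api/v1/namespaces/"
                + namespace + "/pods/" + podName);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Authorization", "Bearer " + token);

        // The JSON response contains metadata.labels, which DNS alone cannot provide.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        }
    }
}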

@slaskawi (Contributor, Author) commented Sep 7, 2017

I thought it might be helpful to describe how I tested it:

  1. Built new OpenShift PING libraries
  2. Booted Docker container: docker run -it --rm -u root jboss-datagrid-7/datagrid71-openshift bash
  3. Copied new libraries using docker cp
  4. Modified the clustered-openshift.xml configuration:
...
<stack name="tcp">
<transport type="TCP" socket-binding="jgroups-tcp"/>
<protocol type="openshift.KUBE_PING">
<property name="split_clusters_during_rolling_update">true</property>
</protocol>
...
  5. Used docker commit
  6. Tested with oc cluster up; when deploying a new version of the DC I got these logs:
12:27:46,490 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ViewHandler-9,finispan-server-3-57znm) ISPN000094: Received new cluster view for channel clustered: [finispan-server-3-57znm|2] (3) [finispan-server-3-57znm, finispan-server-3-ngpsz, finispan-server-3-jzv1k]
12:28:44,313 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ViewHandler-20,finispan-server-3-57znm) ISPN000094: Received new cluster view for channel clustered: [finispan-server-3-57znm|4] (3) [finispan-server-3-57znm, finispan-server-3-ngpsz, finispan-server-3-jzv1k]
12:28:54,428 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ViewHandler-22,finispan-server-3-57znm) ISPN000094: Received new cluster view for channel clustered: [finispan-server-3-57znm|5] (2) [finispan-server-3-57znm, finispan-server-3-ngpsz]
12:28:57,995 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ViewHandler-22,finispan-server-3-57znm) ISPN000094: Received new cluster view for channel clustered: [finispan-server-3-57znm|6] (1) [finispan-server-3-57znm]

Note that there is no finispan-server-4 in there, which is exactly what we expect.

@rcernich (Contributor) commented Sep 7, 2017

With respect to the DNS bit, the DNS query needs to change to get a list of all endpoints, regardless of ready state. My working theory is that maybe there's something in the service record that can be used to identify the deployment; if not, so be it. It's been a long time since I've looked at this and I don't remember the query to use.

@slaskawi (Contributor, Author) commented Sep 8, 2017

@rcernich Unfortunately I couldn't find any information about querying DNS with respect to the ready state. To my knowledge, only Ready Pods are added to the load balancer (Service) and are visible in DNS.

I also looked at my notes from implementing DNS_PING [1][2][3] and ran a test against a running sample app with DNS_PING. Here are the results of my queries:

$ oc get pods
NAME                       READY     STATUS    RESTARTS   AGE
jgroups-dns-ping-1-d1xcc   1/1       Running   1          15m
jgroups-dns-ping-1-j8bpk   1/1       Running   0          4m
jgroups-dns-ping-1-td0m7   1/1       Running   0          4m
jgroups-dns-ping-1-v2qrl   1/1       Running   0          4m
jgroups-dns-ping-1-v9vlg   1/1       Running   0          4m
rheltest                   1/1       Running   0          15m

$ oc get svc
NAME               CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
jgroups-dns-ping   None         <none>        8888/TCP   4m

$ oc run rheltest --image=registry.access.redhat.com/rhel7/rhel-tools --restart=Never --attach -i --tty

bash-4.2$ dig +search +short jgroups-dns-ping SRV
10 100 0 9189f4dd.jgroups-dns-ping.myproject.svc.cluster.local.
bash-4.2$ dig +search +short jgroups-dns-ping SRV
10 100 0 9189f4dd.jgroups-dns-ping.myproject.svc.cluster.local.

... other Pods became ready ...

bash-4.2$ dig +search +short jgroups-dns-ping SRV
10 25 0 9189f4dd.jgroups-dns-ping.myproject.svc.cluster.local.
10 25 0 9089f34a.jgroups-dns-ping.myproject.svc.cluster.local.
10 25 0 8f89f1b7.jgroups-dns-ping.myproject.svc.cluster.local.
10 25 0 8e89f024.jgroups-dns-ping.myproject.svc.cluster.local.
bash-4.2$ dig +search +short 9189f4dd.jgroups-dns-ping.myproject.svc.cluster.local.
172.17.0.4

Unfortunately I could not figure out how to query DNS to gather all Pods regardless of their Ready state. I also could not find any information about this feature on the web :(

@slaskawi closed this Mar 27, 2019