
Conversation

@slaskawi (Contributor) commented Sep 6, 2017

https://issues.jboss.org/browse/CLOUD-2001

This Pull Request represents the same functionality as implemented here: jgroups-extras/jgroups-kubernetes#36

The idea is to split clusters during a Rolling Update (see the JIRA for more information) by setting the split_clusters_during_rolling_update field in KUBE_PING. The implementation is based on the metadata part of the Pod spec: it filters out all pods with a different deployment label. Since rolling out a new deployment updates this label (the -n suffix), we ping only nodes from the same deployment.
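
A minimal sketch of the filtering idea (not the actual patch; the Pod type and the deployment label key below are assumptions made for illustration):

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch only: keep the pods whose deployment label matches the local pod's label,
// so members of the old and the new deployment never see each other during a Rolling Update.
public class DeploymentFilter {

    // Hypothetical pod representation carrying the labels from the Pod metadata.
    public static final class Pod {
        final String name;
        final Map<String, String> labels;

        Pod(String name, Map<String, String> labels) {
            this.name = name;
            this.labels = labels;
        }
    }

    // The label key is an assumption; OpenShift stamps the pods of each rollout
    // with a per-deployment label value such as "<dc-name>-<n>".
    static final String DEPLOYMENT_LABEL = "deployment";

    public static List<Pod> filterSameDeployment(List<Pod> candidates, Pod self) {
        String myDeployment = self.labels.get(DEPLOYMENT_LABEL);
        if (myDeployment == null) {
            // Label missing: fall back to the old behaviour and ping everyone.
            return candidates;
        }
        return candidates.stream()
                .filter(p -> myDeployment.equals(p.labels.get(DEPLOYMENT_LABEL)))
                .collect(Collectors.toList());
    }
}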

@rcernich (Contributor) commented Sep 6, 2017

Can the same functionality be implemented with DNS? If not, this feature would be specific to kube queries, which is OK. It would be nice if both dns and kube offered the same capabilities. Next question would be if this would work with multicast.

@slaskawi (Contributor, Author) commented Sep 7, 2017

Hey @rcernich Sure, I can implement this add-on to DNS Ping as well.

> Next question would be if this would work with multicast.

Yes, there should be no problem with that. However, with a multicast-capable network plugin you can use plain MPING, which is probably much faster (and easier to use, since it requires zero configuration) than any version of the KUBE/DNS Pings.
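
For comparison, a minimal MPING-based stack written in the same clustered-openshift.xml style could look like the sketch below (the jgroups-mping socket binding name follows the usual WildFly defaults and is an assumption here):

...
<stack name="tcp">
<transport type="TCP" socket-binding="jgroups-tcp"/>
<!-- Multicast-based discovery; no KUBE_PING/DNS_PING properties are needed. -->
<protocol type="MPING" socket-binding="jgroups-mping"/>
</stack>
...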

@slaskawi (Contributor, Author) commented Sep 7, 2017

> Hey @rcernich Sure, I can implement this add-on to DNS Ping as well.

Sorry, I was mistaken. This needs to be KUBE_PING-only functionality. With the DNS approach we have no access to Pod labels (that would require calling the Kubernetes API, which is exactly what we want to avoid when using DNS, right?).
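
Just to illustrate what that extra call would involve, here is a rough sketch of reading a Pod's metadata (including its labels) from the Kubernetes API using the mounted service-account token; TLS trust for the cluster CA is omitted for brevity, so treat it as illustrative only:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Sketch of the Kubernetes API call a DNS-based discovery would still need
// in order to see Pod labels (metadata.labels) such as the deployment label.
public class PodMetadataLookup {

    public static String fetchPodJson(String namespace, String podName) throws Exception {
        // Service-account token mounted into every pod by default.
        String token = new String(Files.readAllBytes(
                Paths.get("/var/run/secrets/kubernetes.io/serviceaccount/token")),
                StandardCharsets.UTF_8).trim();

        URL url = new URL("https://kubernetes.default.svc/api/v1/namespaces/"
                + namespace + "/pods/" + podName);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Authorization", "Bearer " + token);

        // The JSON response contains metadata.labels, which DNS alone cannot provide.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        }
    }
}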

@slaskawi (Contributor, Author) commented Sep 7, 2017

I thought it might be helpful to describe how I tested it:

  1. Built new OpenShift PING libraries
  2. Booted Docker container: docker run -it --rm -u root jboss-datagrid-7/datagrid71-openshift bash
  3. Copied new libraries using docker cp
  4. Modified the clustered-openshift.xml configuration:
...
<stack name="tcp">
<transport type="TCP" socket-binding="jgroups-tcp"/>
<protocol type="openshift.KUBE_PING">
<property name="split_clusters_during_rolling_update">true</property>
</protocol>
...
  5. Used docker commit
  6. Tested with oc cluster up; when deploying a new version of the DC I got these logs:
12:27:46,490 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ViewHandler-9,finispan-server-3-57znm) ISPN000094: Received new cluster view for channel clustered: [finispan-server-3-57znm|2] (3) [finispan-server-3-57znm, finispan-server-3-ngpsz, finispan-server-3-jzv1k]
12:28:44,313 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ViewHandler-20,finispan-server-3-57znm) ISPN000094: Received new cluster view for channel clustered: [finispan-server-3-57znm|4] (3) [finispan-server-3-57znm, finispan-server-3-ngpsz, finispan-server-3-jzv1k]
12:28:54,428 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ViewHandler-22,finispan-server-3-57znm) ISPN000094: Received new cluster view for channel clustered: [finispan-server-3-57znm|5] (2) [finispan-server-3-57znm, finispan-server-3-ngpsz]
12:28:57,995 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ViewHandler-22,finispan-server-3-57znm) ISPN000094: Received new cluster view for channel clustered: [finispan-server-3-57znm|6] (1) [finispan-server-3-57znm]

Note that there is no finispan-server-4 in there, which is exactly what we expect.

@rcernich (Contributor) commented Sep 7, 2017

With respect to the DNS bit, the DNS query needs to change to get a list of all endpoints, regardless of ready state. My working theory is that maybe there's something in the service record that can be used to identify the deployment; if not, so be it. It's been a long time since I've looked at this and I don't remember the query to use.

@slaskawi (Contributor, Author) commented Sep 8, 2017

@rcernich Unfortunately I couldn't find any information about querying DNS with respect to the ready state. To my knowledge, only Ready Pods are added to the load balancer (Service) and are visible in DNS.

I also looked at my notes from implementing DNS_PING [1][2][3] and ran a test against a running sample app with DNS_PING. Here are the results of my queries:

$ oc get pods
NAME                       READY     STATUS    RESTARTS   AGE
jgroups-dns-ping-1-d1xcc   1/1       Running   1          15m
jgroups-dns-ping-1-j8bpk   1/1       Running   0          4m
jgroups-dns-ping-1-td0m7   1/1       Running   0          4m
jgroups-dns-ping-1-v2qrl   1/1       Running   0          4m
jgroups-dns-ping-1-v9vlg   1/1       Running   0          4m
rheltest                   1/1       Running   0          15m

$ oc get svc
NAME               CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
jgroups-dns-ping   None         <none>        8888/TCP   4m

$ oc run rheltest --image=registry.access.redhat.com/rhel7/rhel-tools --restart=Never --attach -i --tty

bash-4.2$ dig +search +short jgroups-dns-ping SRV
10 100 0 9189f4dd.jgroups-dns-ping.myproject.svc.cluster.local.
bash-4.2$ dig +search +short jgroups-dns-ping SRV
10 100 0 9189f4dd.jgroups-dns-ping.myproject.svc.cluster.local.

... other Pods became ready ...

bash-4.2$ dig +search +short jgroups-dns-ping SRV
10 25 0 9189f4dd.jgroups-dns-ping.myproject.svc.cluster.local.
10 25 0 9089f34a.jgroups-dns-ping.myproject.svc.cluster.local.
10 25 0 8f89f1b7.jgroups-dns-ping.myproject.svc.cluster.local.
10 25 0 8e89f024.jgroups-dns-ping.myproject.svc.cluster.local.
bash-4.2$ dig +search +short 9189f4dd.jgroups-dns-ping.myproject.svc.cluster.local.
172.17.0.4

Unfortunately I could not figure out how to query DNS to gather all Pods regardless of their Ready state. I also could not find any information about this feature on the web :(

@slaskawi closed this Mar 27, 2019