Skip to content

Conversation

@brknkfr
Copy link

@brknkfr brknkfr commented Apr 16, 2021

For now, Nextcloud Talk gets one random turnservers of the specified turnservers in the backend.

This patch provides all in the backend specified turnservers to the client. So the chance might be higher, that calling in restricted cooperate networks will works successfully (btw, it's up to the client which turnserver to use).

Would probably fix #5267
Fix #3676

@nickvergessen
Copy link
Member

cc @plinss

@brknkfr
Copy link
Author

brknkfr commented Apr 16, 2021

I could need some help with creating working php tests ... (sql php stuff).

@nickvergessen
Copy link
Member

We can fix the tests in the end, no problem

@fancycode
Copy link
Member

Ideally you would first group the servers by scheme and transport and then return a random server of each group. This would allow load-balancing while keeping maximum possible connection possibilities for clients.

E.g. configure 2 turn-udp, 2 turn-tcp and 2 turns-tcp servers, give one of each to clients. With the change as-is, you could end up returning both turn/udp and one turn/tcp server which could prevent a client from connecting (that needs a turns server).

@brknkfr
Copy link
Author

brknkfr commented Apr 16, 2021

E.g. configure 2 turn-udp, 2 turn-tcp and 2 turns-tcp servers, give one of each to clients. With the change as-is, you could end up returning both turn/udp and one turn/tcp server which could prevent a client from connecting (that needs a turns server).

Now the patch takes all specified turnserver combinations and takes random turn-udp, one random turn-tcp and one random turns-tcp.

lib/Config.php Outdated
Comment on lines 309 to 311
(!empty($turn_udp)) ? $turnSettings[] = $turnservers[$turn_udp[array_rand($turn_udp, 1)]] : '';
(!empty($turn_tcp)) ? $turnSettings[] = $turnservers[$turn_tcp[array_rand($turn_tcp, 1)]] : '';
(!empty($turns_tcp)) ? $turnSettings[] = $turnservers[$turns_tcp[array_rand($turns_tcp, 1)]] : '';
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should a turns_udp added aswell for the rare cases someone specifies only a turnserver with turns and udp (dtls over udp)?

@plinss
Copy link

plinss commented Apr 18, 2021

Rather than hard coding a selection for turn/udp, turn/tcp, and turns/tcp, I suggest you group the servers by scheme, protocol, and port, then select one from each group, regardless of how many groups there are. (That also addresses @brknkfr's point about turns_udp which I agree is worth adding.)

For example, I might specify two UDP turn servers on different ports: turn:turn.example1.com:3478?protocol=udp and turn:turn.example2.com:53?protocol=udp. The first using the standard turn port, the second using the DNS port for networks that block all UDP except DNS. No reason the client shouldn't get both.

@brknkfr
Copy link
Author

brknkfr commented Apr 19, 2021

As known from other sources, UDP-over-DTLS (turns_udp) is not a real benefit for webrtc, because webrtc is already encrypted. That said, TCP-over-TLS could help within very restricted networks, where only outgoing connections over port 443/tcp over TLS are allowed.

The more turnservers you specify, the longer it takes to establish a "successful" calling session. That's why for example firefox throws an error if you provide more than five turnservers (Chrome based browsers don't seem to have a problem with more than five turnservers). Generally three turnservers should be more than enough.

Maybe the nextcloud talk backend should be changed in a way, that you can not specify turns and turn and tcp and udp simultaneously, but only turns or turn and tcp or udp. This way you could even put the turnservers in a certain priority. Furthermore it should then provoke a warning, when you specify more than five turnservers in the backend, but I think, this change would require another pull request? (I don't know if I'm able to program or handle all those migration tasks for existing installations).

For now, I'd suggest to leave apart UDP-over-DTLS although it can be configured and to take a maximum of two turn-udp, two turn-tcp and one turns-tcp (with port 443 preferred) of the configured turnservers.

@brknkfr
Copy link
Author

brknkfr commented Apr 19, 2021

Okey, some more thoughts and assumptions:

  • I assume that loadbalancing should normalley not be done by the nextcloud web interface (aka php random chosen turnservers) but better in an adequate backend with adequate loadbalancing tools. It shouldn't be necessary to actually provide multiple turnservers.
  • The webrtc client shouldn't get more than - lets say - three turnservers, otherwise it will take a long time to establish a connections.
  • Webrtc traffic is encrypted, so UDP-over-TLS doesn't make any sense right now (neither does TCP-over-TLS regarding the traffic).
  • I assume that normally a nextcloud instance is running on port 443 with https, the most accessible and common port and aswell accessible by the webrtc stuff.
  • Most restrictive firewalls are blocking outgoing traffic. Mainly, outgoing udp traffic is blocked (but not outgoing tcp traffic). When it's worse, even outgoing tcp traffic is blocked. Even more restrictive firewalls only allow tcp-tls traffic over port 443.

So what now?

I changed the pull request like this: Provide one random udp turnserver and one random tcp turnserver (turn or turns) to the client. And if there is even a webserver looking turnserver (with turns over tcp on port 443), provide it too.

@plinss
Copy link

plinss commented Apr 20, 2021

@brknkfr I'm strongly opposed to only providing 3 turn servers when the admin configures more.

  1. I'm aware 3 is the current recommendation, but it's just that, a recommendation. If there's a need to deal with a combination of hostile networks and you need to set 4 servers, some users are going to fail some of the time. And in the current implementation, the failures will be random, impossible to understand and fix, and will cause all sorts of frustration. The worst case if you have too many is that connections take an extra fraction of a second to setup, the worst case if you have too little is a complete and total failure to connect for some of the users some of the time.
  2. How clients work is evolving, and multiple new clients ship every month. It can take possibly years to make changes on the server back end to deal with new client behaviors if all of a sudden the recommended number becomes, say, 5. Let alone the time to get admins to update their server settings.
  3. The choice of port is as important as the choice of scheme and protocol. Please don't restrict to one scheme/protocol pairing. I may very well need to specify 3 (or more) different turn/tcp ports to get past a given group of firewalls. The set of ports that coffee shop/hotel/pubic wifi hot-spots block is somewhat arbitrary and is frankly a dumpster fire. And while turns/tcp:443 is a pretty good escape hatch, not everyone will be able to run a TURN server with TLS on 443. As an example, I know of an IRC service that runs on about a dozen ports just to ensure all of its users can join from where they need to.

The whole point of this feature is to allow users to get around limitations they can't control. Let's not add more, and arbitrary ones at that.

It'd be fine to have some text in the UI (and possibly a link) recommending no more than 3 servers (and maybe only popping up when more than 3 are set), but let admins specify as many as they need to and make their own choices as to the tradeoffs. The situations will change over time, and you're only looking at a snapshot of current tech.

I'd also go ahead and add support for DTLS now. Yes, it provides no additional privacy or security, but like TLS, it may be needed to get past some firewalls that do deep packet inspection on UDP (I don't know of any offhand that do this today, but I don't know what's going to ship tomorrow, and I don't claim to have comprehensive knowledge of every firewall in existence to rule them out.)

Furthermore, clients already sub-set the list of turn servers based on their capabilities, and they set their own ordering based on their own reasons. There's no point in providing any ordering or semblance of prioritization in the UI as it won't do anything (today).

@brknkfr
Copy link
Author

brknkfr commented Apr 20, 2021

  1. I'm aware 3 is the current recommendation, but it's just that, a recommendation. If there's a need to deal with a combination of hostile networks and you need to set 4 servers, some users are going to fail some of the time. And in the current implementation, the failures will be random, impossible to understand and fix, and will cause all sorts of frustration. The worst case if you have too many is that connections take an extra fraction of a second to setup, the worst case if you have too little is a complete and total failure to connect for some of the users some of the time.

I agree. As it is right now when nextcloud talk just uses one random turnserver. That's the reason I'm digging into this now.

  1. How clients work is evolving, and multiple new clients ship every month. It can take possibly years to make changes on the server back end to deal with new client behaviors if all of a sudden the recommended number becomes, say, 5. Let alone the time to get admins to update their server settings.
    Agreed.

I just configured about ten different scheme/port/protocol turnservers in nextcloud and tried to establish a call session between two firefox browsers both behind very, very restricted networks. Although it took a little bit longer than direct calls, it worked without any problems. This "warning" of the usage of more than 5 turnservers probably can be ignored without problems. Maybe there are other clients out there which could have problems.

  1. The choice of port is as important as the choice of scheme and protocol. Please don't restrict to one scheme/protocol pairing. I may very well need to specify 3 (or more) different turn/tcp ports to get past a given group of firewalls. The set of ports that coffee shop/hotel/pubic wifi hot-spots block is somewhat arbitrary and is frankly a dumpster fire. And while turns/tcp:443 is a pretty good escape hatch, not everyone will be able to run a TURN server with TLS on 443. As an example, I know of an IRC service that runs on about a dozen ports just to ensure all of its users can join from where they need to.

So should the specified turnservers even ordered or sorted in some way or just be provided "as is" to the client? Any hints appreciated.

When specifying an order I would go with a priority like this:

  1. turn_udp (the fastest)
  2. turn_tcp (okey, outgoing udp traffic is blocked, try tcp)
  3. turns_tcp (hmmm, with some luck on port 443 with tls?)
  4. turns_udp (do we ever come to this case?)

I know, it's up the clients to decide, so that's why I'm asking.

It'd be fine to have some text in the UI (and possibly a link) recommending no more than 3 servers (and maybe only popping up when more than 3 are set), but let admins specify as many as they need to and make their own choices as to the tradeoffs. The situations will change over time, and you're only looking at a snapshot of current tech.

Imho, this is for another pull request or an documentation issue.

@nickvergessen
Copy link
Member

apart from that I think this is fine. If the admin adds more we just show a warning in the admin settings and done.

@nickvergessen nickvergessen added 3. to review client: 🤖🍏 mobile enhancement feature: WebRTC 🚡 WebRTC connection between browsers and/or mobile clients labels Apr 22, 2021
@nickvergessen nickvergessen added this to the 💖 Next Major (22) milestone Apr 22, 2021
Copy link
Member

@danxuliu danxuliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay.

The url parameter must be kept, as it is needed by the iOS client. Please note that, for convenience, the returned settings resemble the RTCIceServer dictionary, so in the WebUI it is directly used when creating the PeerConnection. But the returned settings are just data passed to the clients, like any other data returned from the Talk server. Each client parses it on its own way and then uses the parsed data to call the WebRTC APIs as needed.

Specifically, the url parameter is used by the iOS client and the urls parameter is used by the Android client. The WebUI uses the turnservers object as is. Both parameters must be a list with a single element; each additional server must be sent in its own additional object in the turnservers list, the urls property can not be used for that. Of course it would be good if all this was cleaned, but in any case that should be done on its own specific pull request.

Besides that, composing the urls should be done in SignalingController::getSettings, as it is a formatting done specifically by the controller; Config::getTurnSettings should return the list of servers but with schemes, server and protocols as before (like done by @plinss in https://github.com/nextcloud/spreed/pull/4715/files#diff-a9facf8547f41c9326235cee2e92dcfeed102a5eb138475b054fe1f13028e80aR261-R270). Yes, right now it is not used anywhere else from the controller, but let's keep things generic when possible :-)

Independently of that, as explained before personally I would prefer not to just change the semantics of the configuration to return all the servers instead of a random one and do something more elaborate instead. Having said that, getting a random server for load balancing should be achievable by external means, while getting several different TURN servers in those specific setups that may need them is not. After discussing this internally it would be possible to change the behaviour for Talk 12 as long as we document it, so fine by me. I need to add some details to the TURN documentation about turns: and other related things, so I will add an explanation about that as well.

@nickvergessen
Copy link
Member

nickvergessen commented May 4, 2021

The url parameter must be kept, as it is needed by the iOS client.

We can change that and break it (if it is any good and better or clearer or ....), as there is no compatibility possible between Talk 11 (conversations api v1-3) and Talk 12 (conversations api v4) as we can not make the multiple sessions compatible with uniqueness, we can also add a breaking signaling thing on top, especially when it's just about changing a single character

@brknkfr
Copy link
Author

brknkfr commented May 4, 2021

Thank you for the review.

The url parameter must be kept, as it is needed by the iOS client.

I agree with @nickvergessen that we can break this when there is only a small change needed for the iOS client (a missing s) and probably the android client.

Besides that, composing the urls should be done in SignalingController::getSettings, as it is a formatting done specifically by the controller; Config::getTurnSettings should return the list of servers but with schemes, server and protocols as before (like done by @plinss in https://github.com/nextcloud/spreed/pull/4715/files#diff-a9facf8547f41c9326235cee2e92dcfeed102a5eb138475b054fe1f13028e80aR261-R270). Yes, right now it is not used anywhere else from the controller, but let's keep things generic when possible :-)

I agree to this point and I'm gonna change it.

Independently of that, as explained before personally I would prefer not to just change the semantics of the configuration to return all the servers instead of a random one and do something more elaborate instead. Having said that, getting a random server for load balancing should be achievable by external means, while getting several different TURN servers in those specific setups that may need them is not. After discussing this internally it would be possible to change the behaviour for Talk 12 as long as we document it, so fine by me. I need to add some details to the TURN documentation about turns: and other related things, so I will add an explanation about that as well.

I completely agree with you on this point and on your longish comment in the other pull request. Unfortunately, for me, the suggested or changes are "out of my scope".

@danxuliu
Copy link
Member

danxuliu commented May 6, 2021

I agree with @nickvergessen that we can break this when there is only a small change needed for the iOS client (a missing s) and probably the android client.

It is not just adding an s; other changes will be needed (as right now a single element is assumed in the app). But yes, the changes should not be very complex ;-)

I agree that this is a good time to do the change for the reasons mentioned above by @nickvergessen (and @Ivansss gave his blessing ;-) ). But again I think that it should be better to do that in a separate pull request for clarity and ease of reversion if needed (although that should not be needed, but just in case ;-) ).

Therefore I have added a fixup to keep the format, and I have extracted the format changes to their own pull request. I have also added another fixup commit to improve the tests.

Please remember to adjust the commit message after doing the fixup :-)

Copy link
Author

@brknkfr brknkfr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unifying urls will be done #5582

@brknkfr brknkfr requested a review from danxuliu May 7, 2021 08:11
Copy link
Member

@danxuliu danxuliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested* and works 👍

*Only in the WebUI, although from reading the mobile apps code it should work as well.

Thanks!

@danxuliu
Copy link
Member

Oops, linter was not happy. I have added another fixup commit, can you rebase and squash it again? Thanks!

- Provides all in the backend specified turnservers to webrtc clients

Signed-off-by: Sebastian L <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

4. to release client: 🤖🍏 mobile enhancement feature: WebRTC 🚡 WebRTC connection between browsers and/or mobile clients

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Random selection of TURN servers may result in failed connections Console Warnings (Firefox 76)

5 participants