-
Notifications
You must be signed in to change notification settings - Fork 509
Add support for turn:/turns: schemes on turn server settings #4715
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
45b2255 to
5d2c6cb
Compare
…n server settings Fixes nextcloud#257 Signed-off-by: Peter Linss <[email protected]>
5d2c6cb to
750c583
Compare
|
note the drone test failures don't appear to be related to this change |
|
I restarted the drone test, as at least on nextcloud/server this mostly resolved itself then, not sure about this inconsistency, but let's see. Awesome that even STUNS is supported as well with this. While it would have been probably easier (faster to review/discuss) to have TURNS/STUNS support and random server choice removal in separate PRs, I agree with it. I didn't recognise until that it is possible to add multiple servers to the list, but choosing one randomly to pass to the client does not sound like a reasonable implementation. Although, probably when the intention was split load across multiple servers? However, since with "UDP + TCP" two URLs were already passed as ordered array, adding further URLs to the list should work fine with clients as well. It should probably be added to the text above the input fields that clients will try to use the configured servers from top to bottom, each first with UDP, then TCP? A little but notable additional enhancement here is, that one can (but not need to) add the protocol And just to not forget about the discussion around the GUI-side implementation: I hope I find time the next days to test it, but am currently busy with an own release. |
|
Thanks for taking a look @MichaIng. I did consider splitting the turns support and fallback behavior into two PRs, but due to API changes one would have been dependent on the other (or a third would have been required to combine them). While it would have made discussion easier, it would have made testing and merging more difficult, so I made a call and opted for what I hoped would get this landed quickest. Having turns support also really makes more sense in a context that allows specifying fallback order. The random selection also didn't make sense to me without turns in the mix. If the admin specifies multiple TURN servers, you run the chance that clients may be given different servers. Given the role of the TURN server is to relay traffic between the clients, having them connect to different servers would be bad, and somewhat defeat the point. Specifying multiple methods to connect to the same server makes sense, but in that case you really want a fallback behavior so you can specify more performant, but less likely to work methods first, and then fall back to slower, but more reliable ports and protocols in a deterministic order. If load balancing is required for heavy loads, one would want to put the TURN server behind a load balancer (with logic to select the same backend for connections that come in close to each other). (FWIW Coturn recently added support for Proxy protocol making it easier to implement behind a load balancer.) STUN is such a lightweight protocol that load balancing doesn't make much sense. Fallback does for when a server is down, but then one would likely want to specify a private STUN server first, then fall back to public servers. I considered adding more explanatory text to the UI, but given the existing text already has a link to the docs, and the subtleties involved, I thought the docs would be a better place to give detailed explanations. Given that I also considered adding a UI control to select the URL scheme, but felt that might do more harm than good. First, adding something like 'TURN-TLS' is not accurate given that with UDP it's really 'TURN-DTLS'; having something like 'turn'/'turns'/'turn + turns' is adding more acronyms that may not make sense to many users; and adding something like 'secure on/off' is going to entice people to turn it on thinking it improves their security, when that's really not the purpose (or benefit). Second, given the desired fallback ordering, having a 'turn + turns' option may not be of much use. e.g. my setup is currently: This allows people behind somewhat open firewalls to use the most performant option first, then for those that block 'unusual' ports switching to UDP over 443 (which may not be the first choice because sharing port 443 with a web server may require a proxy and be slower/increase server load), then TCP on 3478 for firewalls that block UDP, then finally falling back to turns on 443. This ordering isn't served by UDP+TCP, let alone turn+turns*. Keeping this as a 'power user' feature explained only in the docs may actually be for the best. But I'm open to exploring better UI/UX. (* This ordering and set of options should be able to penetrate all firewalls except those that perform TLS decryption combined with deep packet inspection, which are starting to be a thing. Beating those would require something like TURN over HTTPS, akin to DoH, but that's a bit out of scope here.) |
I guessed that only the first client, which initiates the call, does the TURN server "choice", while when inviting other clients, the signalling server likely forwards the one used TURN server to the others as well? If it was different and each client was randomly trying to connect to different TURN servers while all wanted to share a single video call, then, yeah... who was implementing it 💯 🧠. Not sure if it's possible to have multiple clients connecting to the same TURN server (same video call) with different protocols, e.g. each traversing it's own particular firewall/NAT with the best performing method. If my guess is wrong, one question also is, if the client connects to the "wrong" TURN server that does not serve the video call the client wants to join, if then automatically the next passed server in row is attempted, i.e. which cases trigger a client to try the next given TURN server (or protocol combination).
One could still add two, three of four servers with each only allowing a single protocol, but allowing a single entry which serves all four combinations in a reasonable order would be simpler 😉.
I fully agree, which is why I would vote for a clear and complete documentation before making it transparent via selectors. And above I asked already a bunch of new probably relevant questions which could be sorted by a good documentation. |
I don't know the particularities of the implementation enough to know if the choice of TURN server is shared among the clients or not. A quick glance at the TURN RFC seems to indicate that the first client does indeed specify the connection protocol, but it appears to be optionally specified at later stages (the RFC does say to reject requests that don't include the protocol, but it also states UDP only, so this requirement may not exist in practice). This is a question for people more versed in the lower levels of TURN server behavior and implementation details than me. I think it's pretty important that different clients be allowed to connect in different manners. E.g. if my firewall allows me to do UDP over 3478, there's no guarantee that yours will, you may need TURN+TLS over 443. The first one to connect to the TURN server can't reliably make the choice for others. Maybe if the second fails it starts over and they try the method the second used? It may be possible for all clients to connect to the TURN server but not share a common protocol that works for all. I can understand not being able to mix UDP and TCP connections, but mixing TLS and TCP or UDP and DTLS should be doable. It does allow clients to vary between IPv4 and IPv6. Understanding this would be important to be able to give administrators the best setup advice (or implement the best automatic protocol fallback behavior).
Agreed, but when switching protocols you also often have to switch ports. So specifying the port/protocol pairing would have to be a bit different. Having a single specification for a server with different protocol/port options could work, but would require re-configuring the UI a bit. e.g. specify the server name, then a list of ports/protocols for each server (this would also allow specifying a single secret per server).
If plain UDP doesn't work, then adding DTLS probably isn't going to help traverse the firewall. Unless the firewall does deep packet inspection and blocks certain UDP packets I suppose. I don't know of any that do this though, and the much more common case is that firewalls simply block all UDP traffic (except sometimes DNS). Each extra possibility is just going to slow down the connection attempt (possibly quite long depending on timeouts), so after a few attempts it's probably best to just drop to something likely to work rather than trying every possible combination. |
|
@danxuliu will look into this mostlikely this week |
danxuliu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry a lot for the delay, and for the wall of text that follows :-P
As much as I love symmetry and consistency :-) in this case adding support for stuns: would be problematic, because it is not supported neither in Firefox nor in Chromium. Even more, if a STUN server is checked using stuns: the browser will not fallback to stun:, it will simply fail. Therefore I would remove the support for stuns: and leave only stun: like it was before.
I agree that the random selection of servers is a surprising behaviour (although it is consistent with the behaviour when adding more HPB servers). But it should not be simply removed, as that would break any instance using it (essentially, large instances). Also, even if a better strategy for load balancing would be desirable at least this provides an easy way to provide it. It is similar to the random list of HPB servers and the clustering system provided by HPB servers themselves.
However, the random selection of TURN servers would be probably unneeded for small instances not using an HPB. To simplifiy the UI a possibility could be to hide the option to add other servers if the HPB is not configured, in the same way that it is not possible to add other HPB servers if not in clustered conversations mode (which is not the clustering system provided by HPB servers).
Also, although I agree that the details should be included in the documentation, it could be good to add a small hint in the UI that explaines that the TURN servers will be randomly selected.
However, that still leaves open the question of how to define a fallback list of servers.
It should be taken into account that the list of ICE servers does not behave as a fallback list in the sense of "try with the first one, if it does not work, then try with the second one" and so on. The ICE RFC does not mandate any behaviour when multiple STUN or TURN servers are available, and Firefox and Chromium use every server in the list to gather candidates.
In fact it seems that even duplicated ICE servers (really duplicated, that is, same scheme, address, port and transport) are checked as many times as defined for candidates (if they are duplicated I would have expected them to be merged 🤷). However, I am not sure if the duplicated candidates are then merged or they are also paired with every other remote candidate.
Independently of that, note that although originally it was needed to wait until all the candidates were gathered to start checking them there is an extension to ICE, Trickle ICE, to start the connectivity checks as soon as a candidate is found. Nevertheless this does not change the previously described behaviour: every server in the list will be used to gather candidates (which also affects the used quota of the servers during the negotiation phase, as explained in the documentation for the total-quota parameter). In fact if a connectivity check succeeds for a candidate pair and later a better one is found the second one will be used instead.
Whether a candidate is better than other depends on their priority. The ICE RFC recommends a formula to compute the priority which takes into account the type of the candidate (host, server reflexive, peer reflexive and relay), but not whether the transport is UDP or TCP. However, for the same type, Firefox prioritizes UDP over TCP candidates (the algorithm comes from nICEr) and Chromium prioritizes UDP over TCP candidates and also TCP candidates over TLS candidates.
All this means that the order in which the ICE servers appear in the list does not guarantee anything regarding the priority of their candidates. If there are two different TURN servers with the same scheme and protocol then the browser will probably assign a higher priority to the candidates of the first server defined. However, the browser could even ignore that, and I think that I read somewhere (sorry, fuzzy memory :-) ) that Chromium would even take into account the latency of the connection to the TURN server to prioritize it (although I can not confirm that; I think that I spotted some prioritization in Chromium's code indirectly based on the interface type (ethernet, wi-fi, 3g...) used for the connection if there are several interfaces, but I am not aware of seeing anything related to latency from the remote server 🤷).
In any case, the most common scenario for configuring a TURN server would be using a single server and a single port with different schemes and transports, and in that case all of the above would not be a problem. Due to this, for simplicity, and also to be able to include TURNS support even in a basic form in Talk 11, I have created another pull request that just adds the scheme selector (basically I used a commit from some months ago that I had lying around until I could do further tests, sorry for not pushing it before 🤦).
Nevertheless I agree that in some specific cases it would be good to be able to configure additional servers or ports. For that an option could be to adjust the UI so the main plus button adds a "set" of TURN servers (which are still randomly selected), and then another plus button adds more TURN servers to a given set (the second button could be, for example, to the right of each TURN server in a set, which would make possible to add TURN servers at an arbitrary position in the set). Each TURN server would have the same selectors and input fields as the current ones, only the plus button to its right and a way to separate the sets would be needed. How to store the server sets and also an automatic way to convert the previous configuration to the new one would still need to be defined, though.
Being able to separate the configuration could be useful even if a single server with a single port is used as, although convenient, the simplified configuration with scheme and transport selector is not perfect either.
Both Firefox and Chromium ignore the UDP transport and force TCP when used with a turns: scheme. That is, DTLS is not supported in any of them, and turns:address:port?transport=udp is silently converted to TLS. Therefore, if both turns:address:port?transport=udp and turns:address:port?transport=tcp are specified in the ICE servers in practice they end being a duplicated turns:address:port?transport=tcp.
Finally, regarding TURN servers please note that they do not host calls. A TURN server simply relays the data that it receives to another peer. The connection between the client and the TURN server can be done using UDP, TCP or TLS. On the other hand, the TURN server will relay the data always using UDP (at least it was the case some time ago; maybe that changed, but I have not recently checked it).
Two participants should be able to connect to two different TURN servers in the same call as long as the TURN servers can talk between each other. They should be able to connect to the same server too each one using a different protocol; in that case the TURN server would loopback the relayed UDP data to itself. At least, in theory; I am afraid that I have not tested this, although maybe I can find some time for it at a later point.
To sum up, I would do the following changes:
- Restore settings to only handle STUN and not STUNS too
- Keep the random server list, although communicate it better
- To simplify the settings maybe allow to add random servers only when an HPB is set
- Maybe add server "sets", but how to store the information and also convert previous configuration to the new one?
- Add selector for "turn:", "turns:" and "turn:/turns:"; add "stun:" in front of the stun field - Already done in #5086
- Document that TURNS will always use TCP - Probably in the followup to #5086 for documentation
|
Many thanks for the technical insights, very interesting to read, and great to know that principally multiple clients can connect to different TURN servers and different protocols/schemes, still being in the same Talk session. But btw, when multiple TURN servers are involved, does that mean that the data flow is Client1 => TURN1 => TURN2 => Client2? That wouldn't make any sense for load balancing, as now simply both TURN servers are fully loaded, isn't it? Or is this what the HPB then takes care of, to delegate streams for a large number of participants through multiple TURN servers? I think I still didn't fully understand how both work together, as initially I thought that the HPB basically replaces the TURN server, so it's either or, not both in combination. I mostly, based on this knowledge, agree with your points, but I would not limit the configuration abilities only based on support of two big browsers. There are many other browsers and other clients possible and support might change, which means the Nextcloud Talk would again need to touch that code the follow them. I think a complete documentation instead is important that states the currently known limitations. So I personally would keep allowing to use STUNS, adding a note to the docs that at time of writing, Chromium and Firefox do not support it. This also matches to how you suggest to do with DTLS: Keep the ability to configure it, but document that in most cases it will end up being TLS. Also thanks for already having TURNS support implemented. I also like the selector more than a free input, because there are in fact no other options for this technology, and then it's safer and clearer to be explicit rather than possibly causing admins to try out HTTP or other things, e.g. when it does not work in the first place. However, probably it would have been nice to communicate your PR prior to merging or even starting to work on it. Now we have a number of conflicts here and I'm not sure how big the motivation of plinss is to resolve all of this and follow your suggestions, as then, eehm, I think nothing will be added at all by this PR? No offence, just an idea how to reduce wasted coding time when multiple contributors deal with the same thing 😉. |
Are there really? There is Edge, Safari and Opera too, but they are Chrome-based and use the same behaviour mostlikely. For desktop that covers 96% but 2% are IE which can not do webrtc at all. So it leaves a gap of 2% unknown browsers. Not really worth the troubles from my POV. Let's focus on those browsers and make sure they have the best experience possible. |
|
Indeed standalone WebRTC clients seem to be rare, I thought there would be more. There was OpenWebRTC, but it's not officially maintained anymore. GStreamer supports it for a while. But I didn't think about is that one requires a browser anyway to use the Nextcloud web UI for it 😄. How does the Android NC Talk app do it, via Android WebView, i.e. Chromium? Regardless, if it's about a single selector that similarly to TURNS allows to traverse certain firewalls while similarly to TURNS having some support limitations that can be documented, I'd still vote for it. |
As mentioned several TURN servers for load balancing is most likely used in conjunction with the HPB. When the HPB is used the flow would be Client1 <=> TURN1 <=> HPB and Client2 <=> TURN2 <=> HPB. In this scenario having one or more than one TURN server would not change the required bandwidth.
If there is no HPB there is a P2P connection between each client. With the HPB there is an upload connection from each client to the HPB, and as many download connections as participants for each client from the HPB. The HPB reduces the required upload bandwidth needed by the clients, but it does not replace the TURN server. Both need to be used together.
It is in my TODO.
Yes, totally. It is what I should have done and, moreover, what I intended to do, but unfortunately what you intend to do and what you are able to do sometimes is not the same :-(
Yes, I take the full blame for that. My sincere apologies to @plinss, this should not have happened.
This PR could still add being able to add more than one TURN server for each "set" of random servers.
No offence taken :-)
It uses a WebRTC library.
Do you mean STUNS? STUNS would not provide you more connectivity. If you require TLS to do a STUN request then you will not be able to connect using the candidates that you would gather from the STUN information and you would end having to use TURNS anyway. |
Not when a TURN server is used, or do you refer to the fact that even with a TURN server each peer sends and receives one stream per peer, while HPB optimised this with ondemand/bundling features?
Ah, so the HPB handles streams/communication between multiple TURN servers. Then it makes sense to hide the option to add multiple TURN servers, to avoid errors due to misunderstandings, when some complex own backend would be required instead of the HPB to make that even work.
Sounds a bit too complex to me: The set is chosen randomly while the clients choose the TURN server within the set based on how they handle multiple passed TURN servers? Though theoretically, when clients choose based on reasonable metrics (supported protocol, ping) it could be an enhancement.
Ah, so this means that theoretically the Talk app could add support for STUNS and DTLS so that these option would have some use even that Chromium + Firefix do not support them 😉.
This means the only additional purpose for STUNS in combination with WebRTC would be if someone wants or needs an encrypted/authenticated STUN server request+answer while a direct P2P connection (no TURN) is still possible and done. Since it was already discussed that TURNS is not implemented to enhance security/privacy but for connectivity with strict firewalls, it's consequent to treat STUNS the same, which means that it doesn't add any desired benefit. This will be the same reason why browser did not add STUNS support and I then agree that it doesn't make sense to add it to NC Talk. |
|
Sorry @MichaIng, I missed your reply.
Actually I was ignoring the TURN server in that comment :-) I meant that when there is no HPB each participant connects directly to each other participant, and if there is an HPB then each participant connects directly to the HPB. The TURN server is just a relay, so when a participant needs it the TURN server sits between a participant and the other end, be it another participant or the HPB.
No, no, the TURN server is "transparent". From the point of view of the HPB it does not matter whether the participant is connected directly to the HPB or through a TURN server.
Yes, if you have a list of TURN servers in different ports and protocols you could also have another set of TURN servers with the same ports and protocols but on different machines, so you could have two set of servers with identical capabilities to balance between them.
Theoretically, yes. In practice, if I recall correctly they basically use the same WebRTC library as Chromium (but I have not checked).
👍 :-) |
|
Thanks for the clarification. I wonder why a client would require a TURN server to connect to the HPB, when it's publicly available. And it should be otherwise easily possible to incorporate the TURN server right into the HPB, to avoid relaying each clients through another machine on its way to the HPB. But that's another topic 🙂. |
The HPB needs to use a certain range of ports for WebRTC media connections (20000-40000 by default). A client could be behind a restrictive firewall that only allows connections to port 443, so even if the HPB is publicly accessible the client would need to connect to a TURN server in port 443, and the TURN server will then relay the packets to the 20000-4000 range in the HPB. |
|
@danxuliu what's the state with this? |
|
If I am not missing anything this pull request added support for stuns: and turns:, and also provided the full list of configured servers to the clients rather than a randome one. Support for turns: and providing the full list of configured STUN and TURN servers to the clients got added since this pull request was open (sorry again for the overlap). The support for stuns: could be problematic, as described at the beginning of #4715 (review) , so I would leave it out, at least for now. Regarding my suggestion to add sets of servers to randomly provide them to the clients rather than providing either a random one or the full list this has not been needed yet as far as I know, and it was not originally part of this pull request, so if it is eventually needed it can be done in a follow up. Due to all of the above I will close this pull request. Thanks plinss for your contribution! |
This allows administrators to add
turn:orturns:prefixes to the TURN server names in the configuration.For backward compatibility, servers without either prefix will still generate 'turn:' URLs, those with the scheme specified will preserve that scheme in the URL.
This allows specifying TURN+TLS or TURN+DTLS to allow traversal of restrictive firewalls. e.g.
turns:turn.example.com:443will generally be indistinguishable from https traffic.While I was there, I also added support for
stun:andstuns:prefixes on the STUN server settings, mostly for symmetry and consistency.I also removed the code that picks one of the configured TURN or STUN servers randomly. The code now sends all servers to the client in the specified order. This allows administrators to configure more efficient fallback behavior, e.g. specify unencrypted servers first, then fall back to encrypted connections when needed.
Fixes #257
FWIW, I have a TURN server that only allows TCP/UDP connections on port 3478 and TCP+TLS/UDP+DTLS on ports 443 and 5349 that I can make available for testing (as opposed to COTURN's normal behavior of allowing all protocols on all ports).