Skip to content

Conversation

@vasco-santos
Copy link
Member

@vasco-santos vasco-santos commented Sep 10, 2020

This PR is a follow up of the main implementation of auto relay to inform the connected peers of the new addresses to announce, by sending the new signed peer record via identify-push.

The self peer record update logic was added to create the new record, when the multiaddr changes. Ping unmount needed to be modified to before the transports close, otherwise when the ping unhandle protocol is triggered, the identify-push will try to access the addresses when they do not exist, causing unhandled exceptions, such as https://github.com/libp2p/js-libp2p-tcp/blob/v0.15.1/src/listener.js#L97

When we support dynamic transport updates, we should automate the self peer record update based on events for updated addresses. This will enable us to not verify if the addresses are outdated each time we want to send the self peer record envelope.

Needs:

@vasco-santos vasco-santos force-pushed the chore/auto-relay-multiaddr-update-push branch from 970d680 to 9ec945d Compare September 15, 2020 14:08
src/index.js Outdated
await this.peerStore.stop()
await this.connectionManager.stop()

ping.unmount(this)
Copy link
Member Author

@vasco-santos vasco-santos Sep 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping unhandle protocol is triggered, the identify-push will try to access the addresses when they do not exist, so this needs to happen before the transportManager.close due to the libp2p.multiaddrs getter

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a bug. After the TransportManager is closed, we should have no active listeners, which should just yield an empty array of addresses.

This approach is also not good because we should not be doing an identify push when we are shutting down and this change is going to cause that to happen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do await this.transportManager.close() we will stop the transport listeners. With this, we have an error on getAddrs(), at list from libp2p-tcp: https://github.com/libp2p/js-libp2p-tcp/blob/master/src/listener.js#L96

So, when we do ping.unmount, it will call the unhandle which will try to do the identify-push and use the libp2p.multiaddrs getter, which uses the transport getAddrs, which fail as noted above.

Or we unmount before, which in my opinion is correct as we are also unmounting the other protocols before, or we need to change the transportManager.getAddrs() to catch the transport.getAddrs error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach is also not good because we should not be doing an identify push when we are shutting down and this change is going to cause that to happen.

We are doing that for all other protocols. I think we should go with a more central fix for that. Before identify push, we should check that libp2p is not being stopped. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a check for if (this.isStarted() && this.identifyService) {. We can have a ._isStopping property, or just turn false for isStarted before stopping the protocols

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this, we have an error on getAddrs(), at list from libp2p-tcp: https://github.com/libp2p/js-libp2p-tcp/blob/master/src/listener.js#L96

close sets all the listeners to empty, https://github.com/libp2p/js-libp2p/blob/v0.29.0/src/transport-manager.js#L72, and then getAddrs iterates over that list, which should immediately exit. The transport should never get called because it should already be cleaned up. I think there is a teardown race here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I already understood the issue. In the test that causes this error we do:

await libp2p.stop()
await libp2p.start()

When we do the libp2p stop, the ping.unmount() is not async. So the node will stop and going to restart while at the same time ping.unmount will call the unhandle while tcp server.listen has already kicked in from start.

The best solution in my opinion is to add the _isStopping property. More than fixing this issue, it will block the identify-push from pubsub.stop, dht.stop from being triggered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I just changed the the._isStarted = false for the beginning of stop. It seems not problematic to do this, as even if the stop code throws it would be assigned to false either way. We might get in a weird state if an error happens, but it would be problematic with the previous logic as well. A restart will fix the weird state

@vasco-santos vasco-santos force-pushed the chore/auto-relay-multiaddr-update-push branch from 37cec63 to ec47310 Compare September 17, 2020 07:56
@vasco-santos vasco-santos marked this pull request as ready for review September 17, 2020 07:57
src/index.js Outdated
await this.peerStore.stop()
await this.connectionManager.stop()

ping.unmount(this)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a bug. After the TransportManager is closed, we should have no active listeners, which should just yield an empty array of addresses.

This approach is also not good because we should not be doing an identify push when we are shutting down and this change is going to cause that to happen.

* Creates or updates the self peer record if it exists and is outdated.
* @return {Promise<void>}
*/
async _createOrUpdateSelfPeerRecord () {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method addition seems unnecessary? This just makes _getSelfPeerRecord pretty much useless.

Also, we should not be doing update checks in here, we should just update the record when an actual change happens. This is going to cause a lot of unnecessary address checks.

// TODO push announce multiaddrs update
// libp2p.identifyService.pushToPeerStore()
// Announce multiaddrs update on listen success
libp2p.identifyService.pushToPeerStore()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is being called in a couple places, I think we shouldn't be doing this here.

Perhaps instead we should be updating our addresses in the PeerStore? We don't currently store them there, but it has an event system for changes, which we could just listen to for triggering a self record update and identify push. This enables the identify system to subscribe to the changes and manage everything itself, instead of other systems like this one needing to know these specifics.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that, but it will be tricky.

To enable this, we need to be able to track transportManager getAddrs() changes. With this, once these addresses change, who is responsible to tracking this (perhaps the transportManager while it tracks the transports addrs) needs to update the addressBook for self addresses (with addressBook.set or addressBook.add).

This will cause the event to be triggered as you mentioned, but it will also cause an issue. So, the identifyService would listen on address changes for the self peer id and it would create the new self signed peer record, storing it in the peerStore. But, using addressBook.consumePeerRecord() will currently trigger a new address:change event and get in a loop.

Regarding the second point, we can use Book _setData emit option and put it false if we had the same multiaddrs as before. But, I think we are complicating this a lot.
I would go with when a self address change, we just create the new peer record and do addressBook.consumePeerRecord(), which triggers the event that identify would listen on and Identify does the Push. We can do the same to add own protocol to the protoBook with add on handle and remove on unhandle. With this approach, we would be removing the self peer record creation from Identify. We can go back the my first solution using a Record Manager, or we can figure out a way of making this part of the Address Manager.

I would like to have in the future to support dynamic address announce/noAnnounce changes via an API. So improving the AddressManager to both be responsible for listening for addresses change in the TransportManager, or to the announce/noAnnounce and create/store the self signed peer record would be a good solution. The first point will be difficult to change though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's table this for now, it's not a huge deal since it's all internal and I don't think we need to conflate this prematurely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think we need to solve this to avoid the unnecessary change checks in _getSelfPeerRecord which is going to run on every identify request.

Copy link
Member Author

@vasco-santos vasco-santos Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, checking if is different anytime is not a good call. So this is my proposal:

  • transportManager will listen for transport listening events.
  • when it triggers, transportManager will emit a listening event with getAddrs() result. It might be also useful for application layer, for example ipfs could use this for the daemon stdout of addresses listening on.
  • AddressManager/IdentifyService listen for this event and is responsible for creating the self peer record and add it to the AddressBook. Self addresses will also be updated in the AddressBook with this.

For not complicating too much now, I think we can keep this on the IdentifyService and move it to the Address Manager when we support dynamic API changes. Both need start and stop methods for the transportManager event listener. The event listener should only be setup after all the initial listeners so that we do not create signed peer records per listener in startup. This way, we will only track updates and keep the current behaviour of creating if it does not exist in the first identify call. If we went with the AddressManager, we would need to have extra logic for this initial creation, or the identifyService would need to call a addressManager.createSelfPeerRecord if there was none in the peerStore.

@vasco-santos vasco-santos force-pushed the chore/auto-relay-multiaddr-update-push branch 2 times, most recently from 596b90e to 9daa067 Compare September 23, 2020 11:22
// TODO push announce multiaddrs update
// libp2p.identifyService.pushToPeerStore()
// Announce listen addresses change
listener.emit('listening')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of using the listening event here. But, the other option is to have a new event for the not listening or trigger a manual update of the peerRecord/addressBook self. I was trying to be compliant with the interface-transport.
What do you think @jacobheun ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I can use the close event from the listeners and have a transportManager.emit('change'). But, circuit would need to emit close here and transportManager would track close and listening

@vasco-santos vasco-santos force-pushed the chore/auto-relay-multiaddr-update-push branch from 9daa067 to 2b6c231 Compare September 23, 2020 11:28
@vasco-santos
Copy link
Member Author

vasco-santos commented Sep 23, 2020

@jacobheun
I added two commits two fix the above review

  • assign ._isStarted to false earlier on stop
    • it would be false either way, if any error is thrown
    • it avoids doing identify-push for pubsub+dht that was happening before
    • it does not cause a race condition on ping unmount + fast node.stop() and node.start()
  • use listening transport events to create self peer record on updates
    • there is one concern on removing listen addresses commented on a review comment

As a side note, I will add in a separate PR a ProtoBook event listener for identify-push. Instead of relying on calling pushToPeerStore on .handle and .unhandle, I will add/remove the protocols from the ProtoBook and Identify will listen on these changes

this._listeners.get(key).push(listener)

// Track listen events
listener.on('listening', () => this.emit('listening'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the TransportManager needs to emit events.

If we watch the listening and close events for each transport listener, instead of bubbling those events we can have the TransportManager itself create and seal the updated Peer Record. The TransportManager should always know when a listeners status changes, so pushing updates from here makes sense.

  1. A transport begins/stops listening on an address
  2. TransportManager sees the change and generates a new peer record and stores it the address book
  3. Storing the new record triggers the multiaddr changes
  4. Identify sees the multiaddr change event from the peer store for self and sends out a push

If we then store our protocols in the peer store instead of the upgrader, the identify service can also subscribe to those which keeps the api consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main concern about that approach is that the listen addresses from the transports are from a layer below. The announce addresses for the record will be affected by the Address Manager content. So, it is odd to me that the transport layer will be getting libp2p.multiaddrs, which gets from the Transport Manager itself and the Address Manager. Moreover, supporting dynamic announce multiaddr changes will mean the Transport Manager will need to listen on the changes there.

I would go with:

  1. A transport begins/stops listening on an address
  2. TransportManager sees the change and emits a change event
  3. AddressManager sees the listen change and generate the self peer record (libp2p.multiaddrs code would move into the AddressManager and libp2p multiaddrs getter would be a bridge to that)
  4. Storing the new record triggers the multiaddr changes
  5. Identify sees the multiaddr change event from the peer store for self and sends out a push

Your solution is a bit simpler in this moment, but I think it makes more sense to be the AddressManager to handle the record as the record contains the address to announce and not the listen ones. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we then store our protocols in the peer store instead of the upgrader, the identify service can also subscribe to those which keeps the api consistent.

That I totally agree, and I will follow up a PR to store the protocols in protocol and subscribe to changes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The advantage to internalizing the address change to the transport manager is that it doesn't affect the API on any system at this point so making that change is low risk. I think the Address Manager owning this makes sense, but I'd like to see a design before we change those APIs, to avoid needing to change them in the future.

The Address Manager should also own our observed addresses, tracking them and scoring them accordingly. How should observed addresses flow into the Address Manager? Does it affect the API?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I went with your suggestion for now. I was not expecting to break the API adding this to the AddressManager as it would be basically the same as now in the TransportManager + TransportManager being an event emitter.

Regarding the observed addresses I agree. So, I will start for creating an issue and we can discuss this when it is relevant

@vasco-santos vasco-santos force-pushed the chore/auto-relay-multiaddr-update-push branch from bb716ee to c0a01df Compare September 23, 2020 17:05
const envelope = await Envelope.seal(peerRecord, libp2p.peerId)
libp2p.peerStore.addressBook.consumePeerRecord(envelope)
} catch (_) {}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to a utility file?

expect(tm._listeners.get(Transport.prototype[Symbol.toStringTag])).to.have.length(addrs.length)

// Created Self Peer record on new listen address
expect(tm._createSelfPeerRecord.callCount).to.equal(addrs.length)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to validate the record is being stored in the address book. Validating the correct record is being stored would also be useful.

const envelope = await Envelope.seal(peerRecord, this.libp2p.peerId)
this.libp2p.peerStore.addressBook.consumePeerRecord(envelope)

return this.libp2p.peerStore.addressBook.getRawEnvelope(this.libp2p.peerId)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to return anything, nothing is consuming the result of this, so we should always return null.

nit: Calling this function _updateSelfPeerRecord might be a bit clearer about what it's doing.

@vasco-santos vasco-santos force-pushed the chore/auto-relay-multiaddr-update-push branch 3 times, most recently from 6affff3 to 152a048 Compare September 24, 2020 11:09
Copy link
Contributor

@jacobheun jacobheun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM just one small thing in the tests, feel free to merge when ready.

const connection = await libp2p.dialer.connectToPeer(remoteAddr)
expect(connection).to.exist()
// Wait for nextTick to trigger the identify call
await delay(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? There is a wait immediately after this for identify.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it was a debugging remaining, just removing it

@vasco-santos vasco-santos force-pushed the chore/auto-relay-multiaddr-update-push branch 2 times, most recently from 2f6358d to dfb6ba3 Compare September 25, 2020 14:33
@vasco-santos vasco-santos force-pushed the chore/auto-relay-multiaddr-update-push branch from dfb6ba3 to 7ef8044 Compare September 25, 2020 14:35
@vasco-santos vasco-santos merged commit 8f40eb6 into 0.30.x Sep 25, 2020
@vasco-santos vasco-santos deleted the chore/auto-relay-multiaddr-update-push branch September 25, 2020 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants