Skip to content

Conversation

@dvdplm
Copy link
Contributor

@dvdplm dvdplm commented Dec 1, 2020

This PR attempts to solve #7 by filtering out all occurrences of PhantomData and all type parameters contained by the markers.

EDIT: nested tuples now work too.

It is "an attempt" because it ultimately fails to achieve full removal of the markers. In particular, nested tuples are not handled at all, so

struct A<T> { 
      a: (bool, PhantomData<T>)
}

…works and generates type info equivalent to struct A { a: (bool) }, but

struct A<T> { 
    a: (bool, (u8, PhantomData<T>))
}

…does not work as expected (the nested tuple is removed completely).

I'm leaving this here for future consideration. Maybe the actual use-cases are limited enough that nested tuples aren't ever going to be a problem. Or maybe we need a different approach entirely.

@Robbepop
Copy link
Contributor

Robbepop commented Dec 1, 2020

Thanks for your prototype.

Or maybe we need a different approach entirely.

However, I think we really do need a differerent approach if this approach cannot fix this bug.
I haven't yet taken a proper look into the code though so it might be still possible to be fixed.

You approached the biggest limitation of proc. macros in that they cannot access semantic information of the underlying code but just syntactical information which is why you try to filter PhantomData by querying its name instead of filtering by its true type. However, this string-based filtering has significant downsides such as it really might not reflect the real types and therefore might led to inconveniences, bugs or really confusing behavior. What we often try to do is to somehow make the Rust compiler do what we'd like to do ourselves (e.g. in this case the filtering.). This is oftentimes extremely tricky and involves hacks if it is solvable at all - and even then it takes a good deal of creativity and is really mostly just a hack that somehow works.

@dvdplm
Copy link
Contributor Author

dvdplm commented Dec 4, 2020

However, I think we really do need a differerent approach if this approach cannot fix this bug.
I haven't yet taken a proper look into the code though so it might be still possible to be fixed.

@Robbepop fixed this problem. There might be others ofc, but ptal at your convenience. :)

@dvdplm dvdplm marked this pull request as ready for review December 7, 2020 12:57
@dvdplm dvdplm requested review from Robbepop and ascjones December 7, 2020 12:58
@Robbepop
Copy link
Contributor

I'd like to investigate another approach that does not rely on syntactical filtering as in this PR but rather on some way using the Rust type system before we eventually merge this.

@Robbepop
Copy link
Contributor

Robbepop commented Jan 5, 2021

I have had a conversation with @ascjones yesterday where we talked about this PR.
The idea is that in some cases we actually might want to conserve the phantom data and its associated generic type parameter if we want to regenerate Rust code using the metadata.
Therefore another design for this could be to add another variant kind to TypeDef: TypeDef::Phantom(TypeDefPhantom) with which we could represent PhantomData types that shall be conserved while at the same time it is clear that those type won't matter for actual encoding and decoding.
Since we want to both allow and disallow PhantomData fields we need a way for the user to clearly specify this.
We could do this with some scale_info specific marker attributes such as:

#[derive(scale_info::TypeInfo)]
pub struct Vec3<T> {
    x: i32,
    y: i32,
    z: i32,
    #[scale_info(ignore)]
    a: PhantomData<T>,
}

However, before we go into design of these proc. macro attributes we first need to figure out what is an appropriate default for this scenario:

  1. We could allow all fields by default and explicitly force the user to ignore some fields (even non-PhantomData fields) using `#[scale_info(ignore)].
  2. We only provide this functionality for PhantomData typed fields and turn the default around since it is probably more common that you actually want to ignore a PhantomData field than not using #[scale_info(explicit_phantom)] (or some better name) if you really are interested in the PhantomData field metadata information.

I am actually leaning towards the 1. suggestion because it looks cleaner to me all in all even if it is kinda of a slight mismatch with the default behavior that I am expecting in the common case with PhantomData types.

@dvdplm
Copy link
Contributor Author

dvdplm commented Jan 6, 2021

Since we want to both allow and disallow PhantomData fields we need a way for the user to clearly specify this.
We could do this with some scale_info specific marker attributes such as:

@Robbepop One problem with adding a #[scale_info(ignore)] attribute is that there is already the #[codec(skip)] attribute in parity-scale-codec, it's used today in substrate to skip PhantomData items, so it's the same use case. Requiring users to specify both attributes seems a bit silly and if users add one but not the other they'd run into trouble.
On the other hand, relying on #[codec(skip)] means scale-info is tightly coupled to parity-scale-codec and begs the question if the two should really be separate code bases. After talking to @ascjones I think the decision to stay coupled to SCALE and parity-scale-codec has already been made, but I still wanted to state it one more time (let me know if you disagree!): scale-info is tightly coupled to parity-scale-codec and while we will make a best effort attempt to keep things general, when we run into situations where it's more convenient to stay tightly coupled we will do so.

(The decision to merge the two code bases or not seems non-urgent to me; I suggest we not worry about it for now.)

@Robbepop
Copy link
Contributor

Robbepop commented Jan 6, 2021

Since we want to both allow and disallow PhantomData fields we need a way for the user to clearly specify this.
We could do this with some scale_info specific marker attributes such as:

@Robbepop One problem with adding a #[scale_info(ignore)] attribute is that there is already the #[codec(skip)] attribute in parity-scale-codec, it's used today in substrate to skip PhantomData items, so it's the same use case. Requiring users to specify both attributes seems a bit silly and if users add one but not the other they'd run into trouble.
On the other hand, relying on #[codec(skip)] means scale-info is tightly coupled to parity-scale-codec and begs the question if the two should really be separate code bases. After talking to @ascjones I think the decision to stay coupled to SCALE and parity-scale-codec has already been made, but I still wanted to state it one more time (let me know if you disagree!): scale-info is tightly coupled to parity-scale-codec and while we will make a best effort attempt to keep things general, when we run into situations where it's more convenient to stay tightly coupled we will do so.

(The decision to merge the two code bases or not seems non-urgent to me; I suggest we not worry about it for now.)

parity-scale-codec and scale-info are coupled by their design since scale-info's job is to represent the encoding and decoding of SCALE encodable types. Therefore I actually like the idea to utilize already existing proc. macro attributes of parity-scale-codec for scale-info.

@dvdplm
Copy link
Contributor Author

dvdplm commented Jan 18, 2021

Closing in favor of #48

@dvdplm dvdplm closed this Jan 18, 2021
@dvdplm dvdplm deleted the dp-remove-phantomdata branch January 18, 2021 13:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants