Inferred types `_::Enum` (#3444)
Conversation
---

I'm not necessarily against the RFC, but the motivation and the RFC's change seem completely separate. I don't understand how "people have to import too many things to make serious projects" leads to "and now […]".

---

In crates like `windows`:

```rust
use windows::{
    core::*, Data::Xml::Dom::*, Win32::Foundation::*, Win32::System::Threading::*,
    Win32::UI::WindowsAndMessaging::*,
};
```

---

Even assuming I agreed that's bad practice (which, I don't), it is not clear how that motivation has led to this proposed change.
How can I make this RFC more convincing? I am really new to this, and seeing as you are a contributor, I would like to ask for your help.

---
First, I'm not actually on any team officially, so please don't take my comments with too much weight. That said:

Here's my question: is your thinking that an expansion of inference will let people import fewer types, and that this would in turn cause them to use glob imports less? Assuming yes: this inference change wouldn't make me glob import less. I like glob imports. I want to write the import once and just make the compiler stop bugging me about something that frankly always feels unimportant. I know it's obviously not actually unimportant, but it feels unimportant to stop and tell the compiler silly details over and over.

Even if the user doesn't have to import as many types, they still have to import all the functions. So if we're assuming that "too many imports" is the problem, and that reducing the number below some unknown threshold will make people stop using glob imports, I'm not sure this change reduces the number of imports below that magic threshold, because for me the threshold can be as low as two items. If I'm adding a second item from the same module, and I think I might ever want a third from the same place, I'll just make it a glob.

Is the problem with glob imports that they're not explicit enough about where things come from? Because if the type of […]

I hope this isn't too harsh all at once, and I think more inference might be good, but I'm just not clear what your line of reasoning is about how the problem leads to this specific solution.
Part of it, yes, but I sometimes get really frustrated that I keep having to specify types, and that simple things like match statements require me to specify the type every single time.

It's imported in the background. Although we don't need the exact path, the compiler knows it, and it can be listed in the rustdoc.

Definitely not, you raise some great points and your constructive feedback is welcome.
---

Personally […]

---
I would like to suggest an alternative rigorous definition that satisfies the examples mentioned in the RFC (although it is not very intuitive, imo): when one of the following expression forms (set A) is encountered as the top-level expression in one of the following positions (set B), the […]

Set A: […]

Set B: […]

Set B only applies when the type of the expression at the position can be inferred without resolving the expression itself. Note that this definition explicitly states that Set B does not involve macros. Whether this works for macros like […]

Set A is a pretty arbitrary list of things that typically seem to want the expected type. We aren't really inferring anything in set A, just doing blind expansion based on the inference from set B. These lists will need to be constantly maintained and updated whenever new expression forms/positions appear.
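As a rough sketch of the positions such a definition might enumerate (the names here are invented, the sets above are placeholders, and the proposed `_::`/`_ { .. }` forms appear only in comments since they don't compile today):

```rust
enum Status { Active, Inactive }
struct Config { status: Status }

fn set_status(_s: Status) {}

fn main() {
    // Position: `let` with an explicit type annotation.
    let s: Status = Status::Active; // could become `let s: Status = _::Active;`

    // Position: function argument whose parameter type is concrete.
    set_status(Status::Inactive); // could become `set_status(_::Inactive);`

    // Position: struct field with a known field type.
    let c = Config { status: Status::Active }; // could become `status: _::Active`

    // Counterexample: no expected type here, so there is nothing to expand from.
    let plain = Status::Active; // `let plain = _::Active;` would be rejected
    let _ = (s, c, plain);
}
```

---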
That is so useful! Let me fix it now.

---
One interesting quirk to think about (although unlikely):

```rust
fn foo<T: Default>(t: T) {}
foo(_::default())
```

Should this be allowed? We are not dealing with type inference here, but more like "trait inference".

---

I think you would have to specify the type arg on this one, because:

```rust
fn foo<T: Default>(t: T) {}
foo::<StructImplementingDefault>(_::default())
```
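For reference, both of today's ways to pin down the type in such a call compile fine (using a unit struct as a stand-in for `StructImplementingDefault`):

```rust
#[derive(Default)]
struct StructImplementingDefault;

fn foo<T: Default>(_t: T) {}

fn main() {
    // Turbofish on the function pins `T` directly...
    foo::<StructImplementingDefault>(Default::default());
    // ...or an annotated binding lets inference flow into the call.
    let v: StructImplementingDefault = Default::default();
    foo(v);
}
```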
---

Oh, never mind. Right, we don't really need to reference the trait directly either way.

---
I've been putting off reading this RFC, and looking at the latest version, I can definitely feel like once the aesthetic arguments are put aside, the motivation isn't really there.

And honestly, it's a bit weird to me to realise how relatively okay I am with glob imports in Rust, considering how often I despise them in other languages like JavaScript. The main reason for this is that basically all of the tools in the Rust ecosystem directly interface with compiler internals one way or another, even if by reimplementing parts of the compiler in the case of […]

In the JS ecosystem, if you see a glob import, all hope is essentially lost. You can try and strip away all of the unreasonable ways of interfacing with names, like `eval`, but ultimately, unless you want to reimplement the module system yourself and do a lot of work, a person seeing a glob import knows as much as a machine reading it does. This isn't the case for Rust, and something like […]

So really, this is an aesthetic argument. And honestly… I don't think that importing everything by glob, or by name, is really that big a deal, especially with adequate tooling. Even renaming things. Ultimately, I'm not super against this feature in principle. But I'm also not really sure if it's worth it. Rust's type inference is robust and I don't think it would run into technical issues, just… I don't really know if it's worth the effort.

---
@clarfonthey glob imports easily lead to name collisions when using multiple globs in the same module. And it is really common with names like […]

---
I can understand your point, but when using large libraries in conjunction, like @SOF3 said, it can be easy to run into name collisions. I use Actix and SeaORM and they often have similar type names.

---
Right, I should probably clarify my position: I think that not liking globs is valid, but I also think that using globs is more viable in Rust than in other languages. Meaning, it's both easier to use globs successfully, and also easier to just import everything you need successfully. Rebinding is a bit harder, but still doable. Since seeing how useful […]

Even if you're specifically scoping various types to modules since they conflict, that's still just the first letter of the module, autocomplete, two colons, the first letter of the type, autocomplete. Which may be more to type than […]

My main opinion here is that […]

I'm not convinced that this can't be better solved by improving APIs. For example, you mentioned that types commonly in preludes for different crates used together often share names. I think that this is bad API design, personally, but maybe I'm just not getting it.

---
I do think inferred types are useful when matching, for brevity's sake:

```rust
#[derive(Copy, Clone, Default, Eq, PartialEq, Ord, PartialOrd, Debug, Hash)]
pub struct Reg(pub Option<NonZeroU8>);
#[derive(Debug)]
pub struct Regs {
pub pc: u32,
pub regs: [u32; 31],
}
impl Regs {
pub fn reg(&self, reg: Reg) -> u32 {
reg.0.map_or(0, |reg| self.regs[usize::from(reg.get()) - 1])
}
pub fn set_reg(&mut self, reg: Reg, value: u32) {
if let Some(reg) = reg.0 {
self.regs[usize::from(reg.get()) - 1] = value;
}
}
}
#[derive(Debug)]
pub struct Memory {
bytes: Box<[u8]>,
}
impl Memory {
pub fn read_bytes<const N: usize>(&self, mut addr: u32) -> [u8; N] {
let mut retval = [0u8; N];
for v in &mut retval {
*v = self.bytes[usize::try_from(addr).unwrap()];
addr = addr.wrapping_add(1);
}
retval
}
pub fn write_bytes<const N: usize>(&mut self, mut addr: u32, bytes: [u8; N]) {
for v in bytes {
self.bytes[usize::try_from(addr).unwrap()] = v;
addr = addr.wrapping_add(1);
}
}
}
pub fn run_one_insn(regs: &mut Regs, mem: &mut Memory) {
let insn = Insn::decode(u32::from_le_bytes(mem.read_bytes(regs.pc))).unwrap();
match insn {
_::RType(_ { rd, rs1, rs2, rest: _::Add }) => {
regs.set_reg(rd, regs.reg(rs1).wrapping_add(regs.reg(rs2)));
}
_::RType(_ { rd, rs1, rs2, rest: _::Sub }) => {
regs.set_reg(rd, regs.reg(rs1).wrapping_sub(regs.reg(rs2)));
}
_::RType(_ { rd, rs1, rs2, rest: _::Sll }) => {
regs.set_reg(rd, regs.reg(rs1).wrapping_shl(regs.reg(rs2)));
}
_::RType(_ { rd, rs1, rs2, rest: _::Slt }) => {
regs.set_reg(rd, ((regs.reg(rs1) as i32) < regs.reg(rs2) as i32) as u32);
}
_::RType(_ { rd, rs1, rs2, rest: _::Sltu }) => {
regs.set_reg(rd, (regs.reg(rs1) < regs.reg(rs2)) as u32);
}
// ...
_::IType(_ { rd, rs1, imm, rest: _::Jalr }) => {
let pc = regs.reg(rs1).wrapping_add(imm as u32) & !1;
regs.set_reg(rd, regs.pc.wrapping_add(4));
regs.pc = pc;
return;
}
_::IType(_ { rd, rs1, imm, rest: _::Lb }) => {
let [v] = mem.read_bytes(regs.reg(rs1).wrapping_add(imm as u32));
regs.set_reg(rd, v as i8 as u32);
}
_::IType(_ { rd, rs1, imm, rest: _::Lh }) => {
let v = mem.read_bytes(regs.reg(rs1).wrapping_add(imm as u32));
regs.set_reg(rd, i16::from_le_bytes(v) as u32);
}
_::IType(_ { rd, rs1, imm, rest: _::Lw }) => {
let v = mem.read_bytes(regs.reg(rs1).wrapping_add(imm as u32));
regs.set_reg(rd, u32::from_le_bytes(v));
}
// ...
}
regs.pc = regs.pc.wrapping_add(4);
}
pub enum Insn {
RType(RTypeInsn),
IType(ITypeInsn),
SType(STypeInsn),
BType(BTypeInsn),
UType(UTypeInsn),
JType(JTypeInsn),
}
impl Insn {
pub fn decode(v: u32) -> Option<Self> {
// ...
}
}
pub struct RTypeInsn {
pub rd: Reg,
pub rs1: Reg,
pub rs2: Reg,
pub rest: RTypeInsnRest,
}
pub enum RTypeInsnRest {
Add,
Sub,
Sll,
Slt,
Sltu,
Xor,
Srl,
Sra,
Or,
And,
}
pub struct ITypeInsn {
pub rd: Reg,
pub rs1: Reg,
pub imm: i16,
pub rest: ITypeInsnRest,
}
pub enum ITypeInsnRest {
Jalr,
Lb,
Lh,
Lw,
Lbu,
Lhu,
Addi,
Slti,
Sltiu,
Xori,
Ori,
Andi,
Slli,
Srli,
Srai,
Fence,
FenceTso,
Pause,
Ecall,
Ebreak,
}
// rest of enums ...
```

---
I do like type inference for struct literals and enum variants. However, type inference for associated functions doesn't make sense to me. Given this example:

```rust
fn expect_foo(_: Foo) {}
foo(_::bar());
```

[…]

All in all, it feels like this would add a lot of complexity and make the language less consistent and harder to learn.
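To spell out the difficulty with a concrete (invented) example: with associated functions, the type being inferred is not necessarily the expected type, since any type may have an associated function returning `Foo`:

```rust
struct Foo;
struct Builder;

impl Builder {
    // An associated function on `Builder` that happens to return `Foo`.
    fn bar() -> Foo {
        Foo
    }
}

fn expect_foo(_: Foo) {}

fn main() {
    // Today `Builder` must be named. Under the proposal, `expect_foo(_::bar())`
    // would mean searching for *some* type with a `bar` returning `Foo`,
    // which is a different problem from ordinary type inference.
    expect_foo(Builder::bar());
}
```

---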
**Colon and Underscore Syntax Flaws**

Readability is partially about how quickly a reader can understand code; reducing effort on the writer's side generally increases the effort on the reader's side, and Rust is a language that's designed for maintainability, sacrificing write-time (be that for run-time or for read-time).
Nothing represents the variants; the whole enum is 'brought into scope', which is in line with the way that […]
Firstly, I did admit that the […]

I don't mean this in an antagonistic way, but could you really tell me, if you saw:

```rust
match fred {
    (_::Orange(_ { size, firmness }), _)
```

in an unfamiliar codebase, that it would be easier for you to understand than:

```rust
match fred using (Fruits, b::C) { // qualifying
    (Orange(Orange { size, firmness }), _)
```

The former tells me that there's an […]

Moreover, if I want to find its definition, I'd know that […]
Though the compiler would likely be fine with either, as the reader you understand what's going on much more quickly with the latter syntax.

This is a good example. I don't particularly like that pattern, but if it is part of the language, then my previous point is rendered moot.

Thank you @kennytm, I wasn't aware: there are good points in there. Independently, I've also realised another problem with the colon syntax: it is a lie, i.e. it's misleading as to what is happening under the hood. Proposing regression to […]

---
The way I see it, readability is not about info-dumping as much as possible on the reader. Omitting information only hurts readability if the omitted information is required for understanding the code (the flow of the code, not every little decision the compiler is going to make when compiling it). My argument is that in cases handled by this feature¹, this information is not required. I'll explain with the example you gave:

In both versions, I know that I'm handling the […] No version tells me what the […] The second version tells me that there is a struct named […]

```rust
match fred using (Fruits, b::C) { // qualifying
    (Orange(orange), _) => { ... }
```

Because then we could do […]

---
Ideally, I think the solution for this should work in all pattern contexts, not just match expressions. That would include things like […] The […]

---
I don't think this:

```rust
if let Orange(orange) = fruit using Fruits {
    ...
}
```

is better than this:

```rust
if let Fruits::Orange(orange) = fruit {
    ...
}
```

---
I wrote this PR when I was much younger and more naive. Looking back, I realize there are definitely areas that need revision. One major concern I've been reflecting on is how we resolve the base type when using […]

This is a pretty fundamental shift from how Rust usually operates. It creates a case where types are being used without a visible import path, which could affect both readability and tooling support. For example, it becomes harder to track type usage by grepping for its name or relying on IDE tooling, since […]

One possible solution is to require that the type behind […]

These are just some of my thoughts. I don't have a fully formed solution yet, but I'd love to hear your input so I can revise the RFC accordingly.

---
Rust can already do that - e.g. with […]
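One stable-Rust illustration of this point (the names here are invented, not from the comment): the concrete type behind `Default::default()` is resolved purely from the expected argument type, with no import of the type's path at the call site:

```rust
mod inner {
    #[derive(Debug, Default)]
    pub struct Widget {
        pub size: u32,
    }
}

fn takes(w: inner::Widget) {
    println!("{:?}", w);
}

fn main() {
    // Nothing here names or imports `Widget`; the signature of `takes`
    // fixes the type and inference fills it in.
    takes(Default::default());
}
```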
---

Whoops! You're right! Yet another reason why this change is not problematic!

---
Doesn't this have the exact same problems as […]?

I also don't like the syntax. Firstly, I think […]

---
I agree that it's much better to infer […]:

```rust
match fred {
    (_::Apple, _::Google) => { ... }
    (_::Orange(x), _::Samsung) if x < 7 => { ... }
    (_::Orange(_), x) if !matches!(x, _::Samsung) => { ... }
    (_::Durian, _::Apple) => { ... }
    _ => { ... }
}
```
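For contrast, here is roughly what that match costs today with explicit paths; the two enums are invented stand-ins for whatever `fred` contains:

```rust
enum Fruit { Apple, Orange(u32), Durian }
enum Brand { Google, Samsung, Apple }

fn classify(fred: (Fruit, Brand)) {
    // Every arm must repeat the enum names that the scrutinee type
    // already determines.
    match fred {
        (Fruit::Apple, Brand::Google) => { /* ... */ }
        (Fruit::Orange(x), Brand::Samsung) if x < 7 => { /* ... */ }
        (Fruit::Orange(_), b) if !matches!(b, Brand::Samsung) => { /* ... */ }
        (Fruit::Durian, Brand::Apple) => { /* ... */ }
        _ => { /* ... */ }
    }
}

fn main() {
    classify((Fruit::Apple, Brand::Google));
}
```

---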
In addition to the mention below that we already support this in some cases: in general, this would be most useful when a function is already accepting or returning such a type, and that function is in scope (or being referenced by an explicit path). That means the type is indirectly referenced, and the type system is already willing to infer it as the type of something. For instance, if you write:

```rust
let x = func();
```

then I can definitely see a few different cases for being able to write […]

(The options below reference enums, specifically. This RFC also covers inferring struct types, but I think the variations are different for that, and I'll mention later in this comment how I think that ties in.)

Option 1: […]

Option 2: […]

Option 3: […]

Option 4: […]

I think we should rule out option 2, because it loses most of the benefit of not having to write a […]

I used to favor something closer to option 4, for the flexibility. But I'm now concerned about the breakage/fragility in the ecosystem, and about the potential ambiguity for a human reader about what type is being referenced. I would personally advocate for option 1: only allow the elision when it's made obvious by existing type inference (function arguments, function return values, (non-generic) fields of another struct, etc.). This would still need careful specification, for cases like […]

I think the same goes for structs here. In theory there are analogous options for "infer a struct by its field types", but those seem even less reasonable than inferring an enum by its variant name. I think we should go with the equivalent of option 1 there, too: only infer a struct when its type is already obvious from type inference.
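A sketch of what "option 1" could permit, again with invented names and the elided forms confined to comments:

```rust
enum Mode { Fast, Careful }

fn run(mode: Mode) -> Mode {
    // The return type is fixed by the signature, so the elision would be
    // just as obvious in the arm bodies.
    match mode {
        Mode::Fast => Mode::Careful,    // `_::Fast => _::Careful`
        Mode::Careful => Mode::Careful, // `_::Careful => _::Careful`
    }
}

fn main() {
    // Argument position: the parameter type pins down the enum, so under
    // option 1 `run(_::Fast)` would be allowed.
    let next = run(Mode::Fast);
    // With no annotation and no constraining use, option 1 would reject
    // `let m = _::Fast;` since inference has nothing to go on.
    let _ = next;
}
```

---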
Why? The type inference rules Rust currently has already allow initializing something with an inferred type and then passing it to a function:

```rust
fn foo(bar: i32, baz: f32, qux: bool) {
    dbg!(bar, baz, qux);
}

fn main() {
    let bar = Default::default();
    let baz = Default::default();
    let qux = Default::default();
    foo(bar, baz, qux);
}
```

TBH I don't know why the other options are seriously considered. They are not type inference - they are just "scan every symbol available to the compiler and see if something fits". I mean, why not support […]

---
This discussion reminds me of method resolution. Many of the tradeoffs seem roughly similar. E.g., the presented "Option 1" is spiritually similar to a method resolution rule that only considers "inherent" candidates.

```rust
mod m {
    pub trait Tr<T> { fn f(&self) {} }
}

fn g<T>(x: impl m::Tr<T>) {
    x.f(); //~ Inherent candidate.
}
```

"Option 2" (if I understand the proposal) is spiritually similar to a method resolution rule that would reject the above (since […]):

```rust
mod m {
    pub trait Tr<T> { fn f(&self) {} }
    impl<U> Tr<()> for U {} //~ This is the "one impl".
}

use m::Tr as _; //~ `Tr` is now in scope.

fn g(x: impl Sized) {
    x.f(); //~ Extension candidate (also note "one impl rule").
}
```

That we consider extension candidates is a source of (RFC 1105-allowed) breakage. That this can interact with the one impl rule (a.k.a. "1-impl rule"), as above, extends this further.

It's an interesting counterfactual to consider what Rust would have looked like if we had prioritized avoiding this breakage and had only done method resolution based on inherent candidates (e.g., probably we would have added a way at the use site to make a trait's methods inherent candidates for a receiver of concrete type by explicitly asserting the type implements the trait).

Maybe or maybe not there are lessons to draw from this for the design here. At least, it seems maybe worthwhile to compare what breakage we consider acceptable due to new trait methods and new trait impls with what breakage we might or might not consider acceptable for new enum variants, etc., and if we feel differently between these, why that is.

---
@lemon-gith:

```rust
enum Fruits {
    Orange,
    Grape
}

fn main() {
    match Fruits::Orange as Fruits {
        Fruits::Orange => print!("orange"),
        Fruits::Grape => print!("grape")
    }
}
```

It's not made for it, but it suits very well.

---
I am very surprised to see this discussion still being active. I thought this thread would have died down, but it looks like there are people who genuinely want to see this happen.

I understand the desire to make Rust developers write fewer characters. I still personally hold the idea that this hurts readability, and also greppability (how do I find all code that constructs or deconstructs a particular variant of a particular enum when the variant name can clash with other enums?). One may argue that we have enough LSP power to just use that, but I don't find the reliance on LSPs very pleasant. For the same reason, I would consider allowing things like […]

With that, I still want to leave some practical considerations. This does not seem specifiable with only words. Lots of alternatives and potential directions are being given, but no specific implementation has been drafted (at least according to my knowledge). We cannot do language design on a feature this impactful without at least some experimental support. No amount of specification can predict implementation challenges, potential pitfalls, or cover all edge cases. And I don't think folks working on the compiler will like this proposal: […]

And this is not just me trying to convince y'all to stop considering this proposal, though I just want to say that I think no meaningful progress can be made without work on the compiler, and it will be hard to get compiler work done.

---
In the past, T-Compiler has weighed in on this discussion and has said that they don't really want to implement it in its current state: […]

---

You are the first in this thread to offer concrete direction on how the RFC could be improved. Given the implementation concerns you outlined, would it make sense to scope an initial experiment to function calls and match patterns only? If limited to that surface area, do you think T-Compiler would be open to considering it?

---
I am not. I am telling you this instead: because it is very unlikely you can find an experienced contributor to carry out the implementation work, there is no way for this RFC to proceed. Hence I believe spending a lot of effort on discussing the language design will not yield satisfactory results. Hence I recommend either dropping this proposal (to be clear, I don't think lang ever authorized an experiment here) or attracting an experienced compiler contributor to work towards your goal. Otherwise this cannot get any additional traction.

---
Thanks for the response. I do not agree that the discussion should stop at this point. Whether the proposal can progress should depend on the design and on implementation feasibility. I plan to continue refining the RFC and to investigate what a prototype would involve. If the concerns are about ambiguity or feasibility, I am prepared to address them.

---
I'm not a compiler developer, so I could be totally off the mark, but is this really limiting the surface area? Function calls (I assume you mean the arguments?) are […] What's the difference between supporting […]?

If we want to suggest a limitation to the surface area, I think it would make more sense to pick one: either only support patterns for the experiment, or only support expressions for the experiment.

---
The main case I was trying to avoid is using […]:

```rust
#[derive(Debug, Default)]
struct Test {
    pub test: u8,
}

pub fn main() {
    let mut a = _ { test: 0 }; // Type is not known at this point.
    println!("{:?}", a);
    a = Test { test: 1 }; // Type becomes known here.
    println!("{:?}", a);
}
```

Handling this seems like a different level of complexity compared to `_::Variant` in match patterns or function arguments.

---

Precisely the reason why you need an experienced compiler contributor...

---
@JoshBashed I don't understand how that differs (for our purpose) from a function call. Consider this (using your `Test` struct from above):

```rust
pub fn main() {
    let mut a = _ { test: 0 };      // Type is not known at this point.
    let mut b = Default::default(); // Type is not known at this point.
    println!("{a:?} {b:?}");
    a = Test { test: 1 }; // Type becomes known here.
    b = Test { test: 2 }; // Type becomes known here.
    println!("{a:?} {b:?}");
}
```

The compiler deduces the type of […] Now, consider:

```rust
fn test(_: Test) {}

fn main() {
    test(_ { test: 0 });
    test(Default::default());
}
```

Here, also, the compiler uses the same inference rules it used for […] In both cases we want the compiler, when it sees […]

---
With regards to implementing the feature described in this RFC:

```rust
type Thing<T> = T;

struct Foo {}

fn main() {
    let _foo: Foo = Thing::<_> {};
}
```

The above code currently gives the following error: […] I reasoned implementing the above would be simpler as it is existing syntax; I took a go at it some time ago and it was unexpectedly complicated.

Also for context, the following works on nightly already:

```rust
#![feature(default_field_values)]
#![feature(type_alias_impl_trait)]
#![allow(unused)]
// -Znext-solver=globally is required
macro_rules! infer {
(_ $($tt:tt)+) => {
'block: {
type InferredType = impl ?Sized;
if false {
let fake_value: InferredType = loop {};
break 'block fake_value;
}
InferredType $($tt)+
}
}
}
struct Foo {
foo: u32 = 0,
bar: String,
}
enum Bar {
Huh,
}
fn main() {
let _foo: Foo = infer!(_ {
bar: "lol".to_string(),
..
});
let _bar: Bar = infer!(_::Huh);
}
```

Playground link: https://godbolt.org/z/sYY6WM8eb

Which is kinda funny, that that is allowed but inferred types aren't... So when TAIT and next-solver=globally are stabilized, […]

Of course, that isn't as useful and as nice as what is proposed in this RFC, […]

---
While strictly true, having this feature in combination with default field values (already RFC-accepted) would permit this to be more ergonomic, by not requiring an import or cluttering the code with something that is obvious (to the reader) in context:

```rust
impl Foo {
    pub fn new_with(params: Params) -> Self { /* … */ }
}

Foo::new_with(_ { alpha: 0, beta: "", .. })
```

That's not to say that your concerns are unfounded; I absolutely agree this is not trivial. But I want to push back slightly against the assertion that it doesn't unlock anything new. It will enable new patterns to be established imo.
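For comparison, the closest stable equivalent today both names the type and back-fills the remaining fields from `Default`; `Foo` and `Params` here mirror the hypothetical snippet above:

```rust
#[derive(Default)]
struct Params {
    alpha: u32,
    beta: String,
    gamma: bool,
}

struct Foo;

impl Foo {
    pub fn new_with(_params: Params) -> Self {
        Foo
    }
}

fn main() {
    // Today: the struct must be named, and the elided fields come from
    // `..Default::default()` rather than per-field defaults.
    let _foo = Foo::new_with(Params {
        alpha: 0,
        beta: String::new(),
        ..Default::default()
    });
}
```

---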
If it does, then that's great. I haven't taken a crack at it yet. I hope that will be how it works.

---
That sure is interesting. rustc behaves quite buggily there, IINM. If you swap the branches and substitute […]

---
I've created a working prototype of the feature: https://github.com/JoshBashed/rust/tree/3444

@scottmcm suggested a preferred syntax on Discord, so I've implemented that version. If this approach works well, I'll update the RFC to reflect the new syntax. Please report any bugs you find! I need to fix issues early so we can move forward swiftly.

---
Here is a test file I have created.

```rust
#[derive(Debug, Copy, Clone)]
enum Fruits {
Apple,
Banana,
}
#[derive(Debug, Copy, Clone)]
enum Vegetables {
Carrot,
Potato,
}
#[derive(Debug, Copy, Clone)]
struct Salad {
fruit: Fruits,
vegetable: Vegetables,
}
#[derive(Debug, Copy, Clone)]
enum Food {
Fruits(Fruits),
Vegetables(Vegetables),
Salad {
salad: Salad,
}
}
#[derive(Debug, Copy, Clone)]
struct Point(i32, i32);
fn print_salad(s: &Salad) {
print!("salad with ");
match (s.fruit, s.vegetable) {
(.Apple, .Carrot) => println!("apple and carrot. not bad."),
(.Apple, .Potato) => println!("apple and potato. could be worse."),
(.Banana, .Carrot) => println!("banana and carrot. bananas don't belong in salads."),
(.Banana, .Potato) => println!("banana and potato. bananas still don't belong in salads."),
}
}
fn print_food(f: &Food) {
match f {
&.Fruits(fruit) => println!("fruit: {:?}", fruit),
&.Vegetables(vegetable) => println!("vegetable: {:?}", vegetable),
&.Salad { salad } => print_salad(&salad),
}
}
fn main() {
let s: Salad = .{
fruit: .Apple,
vegetable: .Carrot,
};
print_salad(&s);
let f: Food = .Salad { salad: s };
print_food(&f);
let apple: Food = .Fruits(.Apple);
print_food(&apple);
let vegetables: Food = .Vegetables(.Carrot);
print_food(&vegetables);
print_food(&.Salad { salad: .{ fruit: .Apple, vegetable: .Carrot } });
print_food(&.Fruits(.Apple));
print_food(&.Vegetables(.Carrot));
let p: Point = .(1, 2);
println!("{:?}", p);
}
```

---

This RFC is all about allowing types to be inferred without any compromises. The syntax is as follows: […] For additional information, please read below.

I think this is a much better and more concise syntax.

Rendered