There's an obvious extension here for lifetime inference - the example given doe...

Drakim · on Sept 20, 2014

In general I prefer the compiler to yell at me rather than to secretly help me behind the scenes. Otherwise, I don't learn anything and I get burned later on when there is a situation where the safeguard can't help me.

asuffield · on Sept 20, 2014

This appears to be a general objection to inference algorithms in languages? Personally I find type inference to be immensely valuable.

rwmj · on Sept 20, 2014

OCaml recently added a bunch of "help" -- for example, if you reference a field in a struct, but the name of the field could be from one of several structs, then it will guess which struct you mean.

TBH I don't find this to be that useful -- it covers up potential mistakes, and of course means that code breaks when compiled with the older version of OCaml which didn't do this.

Edit: I should note that this doesn't break type safety.

zem · on Sept 20, 2014

I like clang's general approach, where if it can guess what you mean to do, it says "did you mean ..." in the error message, and carries on trying to compile the rest of the code on an "assume we did make this change" basis, but fails overall.

hamstergene · on Sept 20, 2014

If the object has a destructor with side effects, I believe that would massively complicate reasoning about the program's behavior for a human reading it.

fanf2 · on Sept 21, 2014

That is essentially what region inference in the MLKit compiler does. http://www.elsman.com/mlkit/

It has some unexpected properties: it can sometimes produce dangling pointers because it knows the target data will never be used, even though a GC would treat it as live. And on the other hand it will often promote data to an excessively long-lived region (because region lifetimes have to be nested). So current versions of the MLKit use GC as well as region inference.

riffraff · on Sept 20, 2014

Wouldn't extending the lifetime in these situations lead to subtle, hard-to-trace memory leaks?

keeperofdakeys · on Sept 20, 2014

No. Currently the lifetime of a reference in a struct or enum is not inferred and must be stated explicitly. Having it default to the lifetime of the containing struct or enum would simply mean you don't need to specify it. The danger is that it could infer wrongly, leading to lifetime errors that might be cryptic.

But at the end of the day, the Rust compiler will never allow a reference to outlive the original object.

dbaupp · on Sept 20, 2014

You've misinterpreted: it's not a question about reducing the annotation in a struct/enum definition, but about postponing the destruction of the String so that the later references are valid, i.e. currently we have

  fn test_parse_unsafe() {
      let v = {
          let text = "The cat".to_string();
          tokenize_string3(text.as_slice())
      }; // `text` destroyed here
      assert_eq!(vec![Word("The"), Other(" "), Word("cat")], v);
  }

but the suggestion/question is about changing this to

  fn test_parse_unsafe() {
      let v = {
          let text = "The cat".to_string();
          tokenize_string3(text.as_slice())
      };
      assert_eq!(vec![Word("The"), Other(" "), Word("cat")], v);
  } // `text` destroyed here

so that the references in `v` are valid.

This could lead to "memory leaks", where a destructor is implicitly postponed to a higher scope, but I don't think it would be much of a problem in practice (the promotion would only be through simple scopes, not through loops, and maybe not through `if`s). In fact, there's a yet-to-be-implemented accepted RFC covering this[1] (there's no guarantee that it will be implemented though, just that the idea is mostly sound).

[1]: https://github.com/rust-lang/rfcs/blob/master/active/0031-be...
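
For comparison, the same effect can be written by hand by declaring the String in the outer scope. A minimal sketch in present-day Rust syntax (`as_str` rather than the 2014 `as_slice`); `tokenize` and `hoisted_token_count` are hypothetical stand-ins, not the article's `tokenize_string3`:

```rust
// Hypothetical stand-in for the article's tokenizer: it only splits on
// whitespace, but like the original it returns slices borrowed from `text`.
fn tokenize(text: &str) -> Vec<&str> {
    text.split_whitespace().collect()
}

fn hoisted_token_count() -> usize {
    // `text` is declared in the function's scope, so it outlives `v`.
    let text = "The cat".to_string();
    let v = {
        // The inner block only borrows `text`; nothing is destroyed
        // when the block ends.
        tokenize(text.as_str())
    };
    v.len()
} // `text` destroyed here, after the last use of `v`
```

The proposed promotion would amount to the compiler performing this hoist implicitly.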

dllthomas · on Sept 21, 2014

I like this - you can make sure something isn't named in subsequent code while still allowing it to be used. Ideally there could be some mechanism to statically assert that it is destroyed by some particular point, though I'm not sure what that should look like (I guess that's just making the lifetime explicit?).

asuffield · on Sept 20, 2014

Ah yes, that RFC is roughly what I had in mind, thanks.

I believe that it's safe to promote through an if, although obviously not through a general loop.

dbaupp · on Sept 20, 2014

Yes, I agree that it should be safe, I've softened my original text. However, it would require dynamically tracking if the destructor needs to be run, and there's currently discussion[1] about Rust possibly moving to a static model, for the highest performance.

[1]: https://github.com/rust-lang/rfcs/pull/210

asuffield · on Sept 20, 2014

Consider this:

In a two-way if statement, a given storage location is set on zero, one, or both branches. If it is set on neither branch then the if statement is irrelevant and can be ignored. If it is set on one branch, then either it had an original value and hence can be treated as being set on both branches, or it must be destroyed within the branch of the if (no null pointers - think about it until it is clear that the type system guarantees this). Hence we are only interested in cases which are isomorphic with the location being set on both branches.

We can treat this as a phi node following the if: there is one output value, which has been created in one of two different ways. In this case we don't know statically which value has been constructed, but we do know statically how and when to destroy it regardless of which one we get, because both branches have the same type and storage location. We don't actually need to know where it came from.

Any obvious problems? I think it works...
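
One reading of this argument, sketched in present-day Rust: when both branches produce an owner of the same type, the value can be moved into a single outer slot, which is then destroyed unconditionally at one statically known point. The function name and the strings are illustrative assumptions:

```rust
fn merged_len(cond: bool) -> usize {
    // Both branches produce a String, so `owner` plays the role of the
    // phi node: one value, created in one of two ways.
    let owner: String = if cond {
        "first".to_string()
    } else {
        "second one".to_string()
    };
    // The borrow is taken after the merge, from the single slot.
    let s: &str = owner.as_str();
    s.len()
} // `owner` is destroyed here regardless of which branch ran
```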

dbaupp · on Sept 20, 2014

It doesn't work: the &str could come from completely different types in the two branches.

I.e. one branch could be created by .as_slice() on a String, the other could be created by referencing a global, e.g.

  let s = if cond {
      let some_string = create_it();
      some_string.as_slice()
  } else {
      "literals are always-valid &str's"
  };

`some_string`'s destructor should only be run if the first branch was taken.
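
For illustration, present-day Rust accepts a hand-hoisted form of this, with the owner declared outside the `if` and initialized on only one branch. A sketch, with `create_it` and `pick` as hypothetical stand-ins and `as_str` in place of the 2014 `as_slice`:

```rust
// Hypothetical stand-in for `create_it()` in the snippet above.
fn create_it() -> String {
    "created".to_string()
}

fn pick(cond: bool) -> String {
    // Declared in the outer scope, but only initialized on one branch.
    let some_string;
    let s: &str = if cond {
        some_string = create_it();
        some_string.as_str()
    } else {
        "literals are always-valid &str's"
    };
    // `s` is usable here whichever branch ran.
    s.to_string()
} // `some_string` is destroyed here only if it was initialized
```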

asuffield · on Sept 20, 2014

I'm not sure that this particular example can ever use some_string outside the if without hitting a type error, but I see what you mean.

That seems like a reasonable case to raise a type error. That defines the cases quite neatly: if it's temporary on both branches then it can work, and if it has different lifetimes then the values can't be merged and should be rejected. If the programmer really meant for this to work then they need to copy the global, and copies should be written explicitly.

dbaupp · on Sept 20, 2014

There's no type error at all; we're talking about delaying the destruction of some_string so that the `s` (which is a &str) is valid outside the if. The string literal is a &str with an infinite lifetime, and so can of course be safely restricted to have the same lifetime as the other branch (done implicitly).

However, it's easily possible to have the &str come from temporaries of different types in the two branches. This would restrict the static destruction case to only working through an `if` when the "parent" values have exactly the same types, which doesn't seem nearly as valuable and possibly not worth the effort.

TheLoneWolfling · on Sept 20, 2014

Couldn't the compiler get around that by introducing a boolean variable that is set depending on the branch of the if statement taken, that it checks before running the destructor?

Although this starts getting really messy.

dbaupp · on Sept 20, 2014

Yes, that's exactly how it would be handled, but it's then dynamic destruction, not static.
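
A hand-written sketch of that idea in present-day Rust, using `std::mem::MaybeUninit` plus an explicit boolean. `Tracked`, `pick_len`, and the counter are hypothetical names for illustration, and this is not a claim about how the compiler lowers such code:

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

// Hypothetical owner type; its destructor bumps a counter so that the
// effect of the flag can be observed.
struct Tracked<'a> {
    text: String,
    drops: &'a Cell<u32>,
}

impl<'a> Drop for Tracked<'a> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn pick_len(cond: bool, drops: &Cell<u32>) -> usize {
    // Storage for the owner, hoisted to the function scope and left
    // uninitialized until a branch fills it.
    let mut slot: MaybeUninit<Tracked<'_>> = MaybeUninit::uninit();
    // The boolean set by the branch and checked before destruction.
    let mut needs_drop = false;

    let s: &str = if cond {
        slot.write(Tracked { text: "owned".to_string(), drops });
        needs_drop = true;
        // SAFETY: `slot` was initialized by the `write` call above.
        let tracked: &Tracked<'_> = unsafe { slot.assume_init_ref() };
        tracked.text.as_str()
    } else {
        "a literal"
    };
    let len = s.len();

    if needs_drop {
        // SAFETY: the flag is only set after `slot` is initialized, and
        // `s` is not used past this point.
        unsafe { slot.assume_init_drop() };
    }
    len
}
```

Whether the destructor runs is decided by a runtime check of `needs_drop`, which is the sense in which this is dynamic rather than static.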

TheLoneWolfling · on Sept 21, 2014

Can you elaborate? I fail to see why this is dynamic - it still determines at compile time if/when to run destructors. I was under the impression that dynamic destruction was when you determine when to run destructors at runtime. Garbage collectors or reference counting, in other words.

dbaupp · on Sept 21, 2014

It is dynamic because it is not known at compile time whether the destructor call is executed. I'm using the terminology from RFC PR #210 that I linked above.

TheLoneWolfling · on Sept 22, 2014

Oh, ok.

On a related note, the approach I mentioned is actually mentioned in that RFC PR:

> Store drop-flags for fragments of state on stack out-of-band

dbaupp · on Sept 22, 2014

Yes, correct, that's what "dynamic destruction" is referring to.

asuffield · on Sept 20, 2014

The obviously "safe" thing to do is to push it out as far as the containing function, and no further. That would be sufficient for this scenario and probably all the ones where you would ever want this to happen.