First draft: caller unsafe #330
base: main

* Clearly annotate which methods require extra verification to use safely
* Clearly identify in source code which methods are responsible for performing extra verification
* Provide an easy way to identify in source code if a project doesn't use unsafe code
How about for a binary? Should we set `UnverifiableCodeAttribute` or similar?
Good question. In this design it’s allowed to define a RequiresUnsafe method without an unsafe block if you never actually call the method. If you did call it, there would be an unsafe block and the assembly would have unverifiable code. I’m not sure whether an assembly that defines such a method, but never uses it except through other caller-unsafe methods, should be unverifiable.
My inclination is yes: anything that contains unsafe code should be marked, even if it’s not called from safe code.
The sentence says project. That should be very simple to achieve: does the project have `AllowUnsafeBlocks` or not? That is the barrier today for allowing `unsafe` usage and that should not change. If the desire is to detect in the binary whether we had this attribute or not, then yes, we'd need to add more metadata.

## Proposal

We need to be able to annotate code as unsafe, even if it doesn't use pointers.
This is the critical part.
Assuming the rules are going to be the same for everyone (BCL / SDKs / everybody else), wouldn't we need the opposite as well?
We need to be able to annotate code as safe, even if it uses pointers.
Otherwise it seems to me that this will become viral (similar to the trimming attributes), where a lot of APIs are considered unsafe, because they eventually do something unsafe in some code path.
That’s what unsafe does today. It lets you call unsafe things, but has no requirements on the callers.
I'm a bit lost here.
Does this mean that to produce "safe" packages I'll have to get rid of all the code that uses `Unsafe.As`, `CollectionsMarshal` and others?
How can I "guarantee" to others that my implementation IS safe (even if I use `ImmutableCollectionsMarshal.AsImmutableArray` in a single place in my code base)?
Will all my packages be categorized as "unsafe" because in one low-level assembly I do use `Unsafe.As`?
Also, I fully agree with @rolfbjarne: FieldOffset are unsafe...
So the question is: what code base remains on the "safe" side?
> That’s what unsafe does today. It lets you call unsafe things, but has no requirements on the callers.

Ah, I (think I) see: this new attribute would need to be added manually; it won't be added automatically whenever someone uses unsafe code.
> I'm a bit lost here. Does this mean that to produce "safe" packages I'll have to get rid of all the code that uses `Unsafe.As`, `CollectionsMarshal` and others? How can I "guarantee" to others that my implementation IS safe (even if I use `ImmutableCollectionsMarshal.AsImmutableArray` in a single place in my code base)? Will all my packages be categorized as "unsafe" because in one low-level assembly I do use `Unsafe.As`? Also, I fully agree with @rolfbjarne: FieldOffset are unsafe... So the question is: what code base remains on the "safe" side?

Does this doc mention marking packages as unsafe? I don't see it. It's perfectly fine to have unsafe code, it just needs to be recognized as such at this point.
@olivier-spinelli -- Fair question. As @EgorBo says, that's not what is being communicated. In terms of helping people understand what we're thinking, this was probably a poor doc to go first. Expect some higher-level docs that better explain a plan.
Propagating `unsafe` warnings too widely would be very disappointing and would lead developers to disable the new `unsafe` mechanism altogether. We should take acceptance into consideration.
I think the key point is to make customers aware of unsafety at the use site.
There is also a widespread practice of disallowing `unsafe` syntax entirely and using the `Marshal` family instead, which is inefficient and typically more unsafe. We should provide a path for moving people away from that.
proposed/caller-unsafe.md
Outdated
Some examples of APIs or features that are unsafe due to exposing uninitialized memory include:

* ArrayPool.Rent
* The `stackalloc` C# feature used with `SkipLocalsInit` and no initializer
* The `stackalloc` C# feature used with `SkipLocalsInit` and no initializer
* The `stackalloc` C# feature with no initializer

`SkipLocalsInit` is irrelevant, as `stackalloc`s are not guaranteed to be initialized.
That's not true. We consider any non-zero-initialized stackalloc a bug if `SkipLocalsInit` is not set. We might want to adjust the ECMA spec to reflect that.
This is what the C# standard specifies; it doesn't guarantee you get `localloc`-like behaviour, which has the required guarantees. If we updated the C# spec, that would work too. Or if Roslyn explicitly documented their `stackalloc`'s behaviour, like they do with `bool` values, that would work too.
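As a hedged illustration of the two `stackalloc` forms under discussion, the sketch below contrasts a `stackalloc` with an initializer against one without an initializer that is cleared before being read. The type and method names are hypothetical. Whether the uncleared form is guaranteed to be zeroed is exactly the point in dispute above, so the sketch clears explicitly and makes no claim about uncleared contents.

```C#
using System;

// Hypothetical demo type; names are illustrative, not from the proposal.
static class StackallocDemo
{
    // With an initializer, every element has a defined value.
    public static int SumWithInitializer()
    {
        Span<int> values = stackalloc int[] { 1, 2, 3 };
        int sum = 0;
        foreach (int v in values) sum += v;
        return sum;
    }

    // Without an initializer, clear the buffer before reading it, so the
    // result does not depend on how stack memory happens to be initialized.
    public static int SumAfterExplicitClear()
    {
        Span<int> scratch = stackalloc int[8];
        scratch.Clear();
        int sum = 0;
        foreach (int v in scratch) sum += v;
        return sum;
    }
}
```

Both methods compile in a safe context because the result is assigned to a `Span<int>` rather than a pointer.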
proposed/caller-unsafe.md
Outdated

* Memory safety
* No access to uninitialized memory

In this document **memory safety** is strictly defined as: safe code can never acquire a reference to memory that is not managed by the application. "Managed" here does not refer to solely to heap-allocated, garbage collected memory, but also includes stack-allocated variables that are considered allocated by the runtime.
In this document **memory safety** is strictly defined as: code can never acquire a reference to memory that is not managed by the application. "Managed" here does not refer to solely to heap-allocated, garbage collected memory, but also includes stack-allocated variables that are considered allocated by the runtime.
In this document **memory safety** is strictly defined as: code can never acquire a reference to memory that is not managed by the application. "Managed" here does not refer to solely to heap-allocated, garbage collected memory, but also includes stack-allocated variables that are considered allocated by the runtime.
In this document **memory safety** is strictly defined as: code can never acquire a reference to memory that is not managed by the application. "Managed" here does not refer solely to heap-allocated, garbage collected memory, but also includes stack-allocated variables that are considered allocated by the runtime.

Notably, unsafe did not change the requirement that the code in the block must be correct. It merely offset the responsibility from the language and the runtime to the user in verification.

For more precise details on the error semantics of unsafe blocks and unsafe members, the rules will mirror the rules defined for "Requires" attributes defined in [Feature attribute semantics](https://github.com/dotnet/runtime/blob/main/docs/design/tools/illink/feature-attribute-semantics.md#requiresfeatureattribute). The only addition is the presence of the `unsafe` block, which effectively provides a local `Requires` context.
Behavior of "Requires" attributes differs between classes and structs. Is it desirable to match that in this C# language feature?
Do we anticipate any future IL analysis here? That’s one of the things that makes structs tricky.
I see this as a C# language feature. I do not think it makes sense to think about IL analysis in the context of C# language feature.
Co-authored-by: Jan Kotas <[email protected]>

## Background

C# has had the `unsafe` feature since 1.0. There are two different syntaxes for the feature: a block syntax that can be used inside methods and a modifier that can appear on members and types. The original semantics only concern pointers. An error is produced if a variable of pointer type appears outside an unsafe context. For the block syntax, this is anywhere inside the block; for members this is inside the member; for types this is anywhere inside the type. Pointer operations are not fully validated by the type system, so this feature is useful at identifying areas of code needing more validation. Unsafe has subsequently been augmented to also turn off lifetime checking for ref-like variables, but the fundamental semantics are unchanged -- the `unsafe` context serves only to avoid an error that would otherwise occur.
> for members this is inside the member

It could be good to clarify that if `unsafe` is on the member, it has an effect on the member's signature as well.
While existing `unsafe` is useful, it is limited by only applying to pointers and ref-like lifetimes. Many methods may be considered unsafe, but the unsafety may not be related to pointers or ref-like lifetimes. For example, almost all methods in the System.RuntimeServices.CompilerServices.Unsafe class has the same safety issues as pointers, but do not require an `unsafe` block. The same is true of the System.RuntimeServices.CompilerServices.Marshal class.
While existing `unsafe` is useful, it is limited by only applying to pointers and ref-like lifetimes. Many methods may be considered unsafe, but the unsafety may not be related to pointers or ref-like lifetimes. For example, almost all methods in the System.RuntimeServices.CompilerServices.Unsafe class has the same safety issues as pointers, but do not require an `unsafe` block. The same is true of the System.RuntimeServices.CompilerServices.Marshal class.
While existing `unsafe` is useful, it is limited by only applying to pointers and ref-like lifetimes. Many methods may be considered unsafe, but the unsafety may not be related to pointers or ref-like lifetimes. For example, almost all methods in the System.RuntimeServices.CompilerServices.Unsafe class have the same safety issues as pointers, but do not require an `unsafe` block. The same is true of the System.RuntimeServices.CompilerServices.Marshal class.
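To make the point about the `Unsafe` class concrete, here is a minimal sketch that reinterprets the bits of a `float` as an `int` using `Unsafe.As` from the shipping `System.Runtime.CompilerServices` namespace. Nothing in it uses the `unsafe` keyword or requires `AllowUnsafeBlocks`, yet the type system is bypassed. The wrapper type and method name are hypothetical, and the sketch deliberately uses two types of equal size so that the reinterpretation stays in bounds.

```C#
using System.Runtime.CompilerServices;

// Hypothetical demo type; only Unsafe.As is a real API here.
static class ReinterpretDemo
{
    // Reads the raw bits of a float as an int. No 'unsafe' context is
    // needed, although the compiler cannot verify the reinterpretation.
    public static int FloatBits(float value)
    {
        return Unsafe.As<float, int>(ref value);
    }
}
```

With mismatched sizes (for example `Unsafe.As<int, long>`), the same call shape would read past the end of the variable, which is the kind of hazard the proposal wants callers to acknowledge.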
In addition to compiler enforcement, the following attribute will be added for annotating unsafe members. It is an error to use this attribute directly in C#. Instead, the `unsafe` keyword should be used.

```C#
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct | AttributeTargets.Method | AttributeTargets.Property | AttributeTargets.Constructor)]
```
So I assume if you apply this to a type (class/struct), all members are considered "RequiresUnsafe"? But note that today in C#, iterator methods can escape the unsafe context of a class:
unsafe class C
{
    void M1()
    {
        int* p; // ok
    }
    IEnumerable<int> M() // iterator method
    {
        int* p; // error
        yield return 0;
    }
}
How would RequiresUnsafe semantics apply to the iterator method?
Tbh, I don't think it should be allowed on a type level - my understanding is that Rust doesn't allow this for the following reason: a type itself cannot be unsafe, only operations can be, and I personally think that is pretty sensible. In saying that, we do have precedent for this in C# already, with pointer types, but I personally tend to think Rust's solution of only allowing you to do unsafe things with them in an unsafe context is the better approach. The other reason that I don't think UnsafeCallersOnly should be allowed on a type level is that it feels suspicious to claim that the only possible ways to use a type would be `unsafe`, but maybe we need this since we allow `default` on any type, which could be an invalid (& hence requiring unsafe) state for a struct?
> a type itself cannot be unsafe

FWIW, pointer types themselves are an example of "unsafe types". For example, it might be considered inconsistent if you couldn't create your custom struct `MyPointer` with the same unsafe semantics as `int*`.
Anyway, how would it work if RequiresUnsafe wouldn't work for types? Users can already declare types as unsafe in C# today and that propagates to its members (except iterators) automatically. If we excluded unsafe types from RequiresUnsafe semantics, would we require users to mark the members separately as unsafe in order to opt them in for RequiresUnsafe? Or just error on unsafe types when EnableRequiresUnsafe is enabled for project?
> > a type itself cannot be unsafe
>
> FWIW, pointer types themselves are an example of "unsafe types".

This was explaining how it works in Rust, not how it works today in C#:

> my understanding is that rust doesn't allow this for the following reason: a type itself cannot be unsafe, only operations can be

In Rust, the following function is not unsafe-callers-only: `fn a(x: *const i32)` (link); however, attempting to perform an operation on the pointer does require `unsafe` (link).
I think this is the ideal setup, but I'm not certain it's possible to actually do in C#, since we have `default` on `struct`s (and `new()`), which might represent an `unsafe` state on structs (Rust avoids this problem, since `Default` is a trait & you are not required to implement it). It is probably possible to make this work w/o requiring unsafe-callers-only types if we can mark `default` for a `struct` as being unsafe & if we can keep track of where `default` is, or may (e.g., in generics) be, used on it & complain somewhere about doing that (for generics, this would probably be at the point of passing it as a generic parameter, unless it's marked with some `allows nondefault` anti-constraint).
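The `default` concern can be shown with a small sketch using only today's C#: a struct whose constructor enforces an invariant that `default` silently bypasses. The struct and its invariant are hypothetical and chosen only for illustration. A real "unsafe state" would be one where later code relies on the invariant for memory safety.

```C#
using System;

// Hypothetical struct: the constructor guarantees a non-zero value.
readonly struct NonZeroDivisor
{
    private readonly int _value;

    public NonZeroDivisor(int value)
    {
        if (value == 0) throw new ArgumentOutOfRangeException(nameof(value));
        _value = value;
    }

    public int Value => _value;
}

static class DefaultStructDemo
{
    // 'default' never runs the constructor, so the invariant is not checked.
    public static int DefaultValue()
    {
        NonZeroDivisor divisor = default;
        return divisor.Value;
    }
}
```

Here `DefaultStructDemo.DefaultValue()` yields 0 even though `new NonZeroDivisor(0)` throws, which is why the comment above suggests that `default` itself might need to be treated as an unsafe operation for some structs.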
> Anyway, how would it work if RequiresUnsafe wouldn't work for types? Users can already declare types as unsafe in C# today and that propagates to its members (except iterators) automatically.

It also doesn't propagate to `async` members. I personally think marking something unsafe-callers-only should be a separate toggle from marking it unsafe-impl anyway (which is what the `unsafe` keyword does in all positions today). Since we are considering massively changing what the `unsafe` keyword means anyway, I think we should change it to whatever makes most sense. I know for a fact that all the types I've added `unsafe` to were just for convenience, since they had a lot of members that worked with pointers, but it didn't make the types themselves necessarily actually `unsafe` in all possible contexts - I suspect this is also the case for most (or all, excepting the `default`/`new()` thing on structs I mentioned earlier) other uses.
Additionally, the current proposal doesn't even make marking a type with `unsafe` keep the same members having an unsafe context - it explicitly excludes most instance members on class types, explicitly excludes nested types, and includes any generator & async functions not otherwise excluded - these are all differences from what gets an unsafe-impl context today from the `unsafe` keyword on a type.
It also seems to differ here:

> Note that this does not create a feature requirement for nested types or for members of base classes or interfaces implemented by the type.

When you mark a type unsafe-impl today, these members do get an unsafe-impl context, but if we change the meaning to getting an unsafe-callers-only context, they shouldn't get that (which makes sense for base/interface members that aren't unsafe-callers-only, but is also another change from what putting `unsafe` on a type means today).
Also:
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct | AttributeTargets.Method | AttributeTargets.Property | AttributeTargets.Constructor)]
These targets also seem wrong imo, I didn't notice it earlier as I must have skimmed over that line, but e.g., if I have a pointer field (or any other type we mark unsafe-callers-only), does that require me to mark the whole type unsafe-callers-only, even if it has safe operations on it & just happens to use pointers or other unsafe functionality? Do we need to strictly disallow unsafe-callers-only types as instance fields in classes, since we cannot mark them as unsafe-callers-only, and the current spec specifies that it doesn't get passed down to instance members in classes (other than constructors), thus making it impossible to have them as instance fields?
> Or just error on unsafe types when EnableRequiresUnsafe is enabled for project?

If we change the meaning of `unsafe` on a type to what the current spec specifies (of which I pointed out a number of differences above), then imo we should at least warn at every pre-existing usage to say "this now means something completely different to what it did before, fix it" anyway.
Lastly, #330 (comment)

> If we do not do this, we will have to be explaining why `unsafe` in C# is non-sensical and different from Rust. When we have to be explaining, we are failing.

I think it is inevitable that we will end up with some differences from Rust (due to the reasons I'm about to explain). We could make it the same if we really wanted to, but this would require a massive set of changes to `unsafe` - we would need to make pointers safe but operations on them unsafe, we would need to make `unsafe` mean `unsafe-callers-only`, and we would need to disallow `unsafe` on the class/struct level. This is probably the best solution, but I haven't mentioned it explicitly until now as I see it as too breaking; if we're already breaking whether `unsafe` means unsafe-impl or unsafe-callers-only, though, then I don't see any good reason to stop halfway at fixing what `unsafe` means & how it works. The reason I proposed making `unsafecallersonly` a separate keyword initially was purely because I wanted to avoid making a massive break in what the `unsafe` keyword means, but if we're going to change its meaning anyway as much as is currently proposed, then we should do it right imo (& I think Rust's solution is more or less the ideal one - we probably need to make a few adjustments for C# (like allowing it on properties, events, fields, constructors, & default in addition to methods - as these are all achieved in different ways in Rust), but I think it's pretty much right).
> Tbh, I don't think it should be allowed on a type level - my understanding is that rust doesn't allow this for the following reason: a type itself cannot be unsafe, only operations can be, and I personally think that is pretty sensible.

In Rust, impls can be unsafe, which serves a similar purpose here. C# does not have a meaningful difference between type declaration and implementation.
We will need some way to handle the following problem:
class UnsafeEnumerable<T> : IEnumerable<T>
{
    public unsafe IEnumerator<T> GetEnumerator() => ...; // use unsafe code
}
We have to guard virtual invocations (interface methods), otherwise you can trivially bypass unsafe. But we can't mark `IEnumerable<T>` itself unsafe. So we need some way to handle `UnsafeEnumerable`. The way we do it with trimming is that you mark the type itself, which in turn marks all the constructors, but hides errors for the instance members, because if you managed to construct the type, you already used `unsafe`.
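For readers unfamiliar with the trimming precedent mentioned here, the sketch below shows the existing type-level pattern using `RequiresUnreferencedCodeAttribute`, assuming .NET 6 or later (where the attribute can target classes). The class name, the message text, and the reflection-based body are hypothetical. This shows the trimming analogue only. It does not show how the proposed `unsafe` feature would be spelled.

```C#
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

// Hypothetical type: the annotation sits on the class, so (per the comment
// above) callers are warned at construction rather than at each
// interface call.
[RequiresUnreferencedCode("Hypothetical reason: enumerates members via reflection.")]
sealed class ReflectingEnumerable : IEnumerable<string>
{
    private readonly Type _type;

    public ReflectingEnumerable(Type type) => _type = type;

    // Instance member reached through IEnumerable<string>; the interface
    // itself carries no annotation.
    public IEnumerator<string> GetEnumerator()
    {
        foreach (var method in _type.GetMethods())
            yield return method.Name;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Once an instance exists, code holding only an `IEnumerable<string>` can enumerate it without further diagnostics, which mirrors the "if you managed to construct the type, you already used `unsafe`" reasoning.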
> C# does not have a meaningful difference between type declaration and implementation.

Right, so the purpose of marking a type unsafe-callers-only is to indicate that 1 or more of the interfaces it implements is implemented unsafely? That is really not clear from anything imo (not clear syntactically, also not clear from this document), but I suppose it makes sense if the intent is that it's the only reason (or `default`/`new()` on structs being an unsafe state) you'd use it.
Good feedback, I think this doc needs some more examples and explanation
I've left some comments primarily of questions about this precise approach. I'm not certain what I think the correct answer to them is, but I personally think they are important questions that need some consideration, so I've left them with some of my thoughts about them for everyone to think about to ensure we come up with the best version of the feature possible.
Thanks if you read the whole of my comments 😅
The overall goal is to ensure .NET code is "valid," in the sense that certain properties are always true. Generating a complete list of such properties is out of scope of this document. However, at least the following properties are required:

* Memory safety
* No access to uninitialized memory
Is it actually possible to guarantee this (`No access to uninitialized memory`) w.r.t. the above paragraph? E.g., you can write an `ArrayPool` without any `unsafe` code, with the only difference being that it would have to initialize each new array it creates; but that doesn't change the fact that the array effectively contains uninitialized values, for the purpose of that usage, on any subsequent rent. If this is not counted as `unsafe`, then ArrayPool.Shared.Rent could be made not unsafe-callers-only for the same reason as my custom `ArrayPool` above, solely by replacing usages of `AllocateUninitializedArray` with `new T[]` (& any other similar things, if any), which definitely feels against the intent of the definition of "No access to uninitialized memory". I don't know if I'm missing something, but it seems impossible to guarantee that these properties hold in full generality when a developer writes only safe code with these particular definitions, unless we blame the unsafe-callers-only status of `ArrayPool.Rent` solely on APIs like `AllocateUninitializedArray` rather than also on the fact that arrays are not guaranteed to be cleared on return & re-rent.
Basically, I think we can guarantee memory safety if no `unsafe` code exists, but I don't think we can guarantee "no access to uninitialized memory", depending on how it's defined (specifically, whether we only care about the first allocation of the array, or also about subsequent uses of the array not being zeroed when re-rented). If the intent is that we only care about the first allocation, it would be good to mention this explicitly, as I don't think the document makes that intent clear.
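The scenario described above can be sketched in a few lines of fully safe C#. The pool below is a hypothetical, deliberately naive stand-in (it is not `ArrayPool<T>`): every array it creates is zero-initialized by `new int[]`, yet a second renter still observes the previous renter's data.

```C#
using System.Collections.Generic;

// Hypothetical minimal pool written with no unsafe code at all.
sealed class NaivePool
{
    private readonly Stack<int[]> _free = new Stack<int[]>();

    public int[] Rent(int length)
    {
        // Reuse a returned array when its length matches; otherwise
        // allocate a fresh, zero-initialized one.
        if (_free.Count > 0 && _free.Peek().Length == length)
            return _free.Pop();
        return new int[length];
    }

    // Returned arrays are not cleared, so stale contents survive.
    public void Return(int[] array) => _free.Push(array);
}

static class NaivePoolDemo
{
    public static int ReadStaleValue()
    {
        var pool = new NaivePool();
        int[] first = pool.Rent(4);
        first[0] = 42;
        pool.Return(first);

        int[] second = pool.Rent(4); // same array instance comes back
        return second[0];            // observes the earlier renter's write
    }
}
```

`NaivePoolDemo.ReadStaleValue()` returns 42, so whether this counts as "access to uninitialized memory" depends on the definition the document settles on, which is the question the comment raises.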
For more precise details on the error semantics of unsafe blocks and unsafe members, the rules will mirror the rules defined for "Requires" attributes defined in [Feature attribute semantics](https://github.com/dotnet/runtime/blob/main/docs/design/tools/illink/feature-attribute-semantics.md#requiresfeatureattribute). The only addition is the presence of the `unsafe` block, which effectively provides a local `Requires` context.
If we allow marking types as unsafe-callers-only, then it probably needs to be for all members imo. This probably also needs to include nested types, as they are textually within the `unsafe` block, and get an `unsafe` context today (unless we're planning to make massive breaking changes, in which case there are probably other things we could change too, e.g., simply not allowing it on the type to start with, which has much more obvious behaviour than it would if it copied how `Requires` works for contained members - especially since it's a textual keyword, not an attribute). Additionally, some things introduce a safe context today; I think we need to think about having a `safe` keyword (& corresponding `[AllowsSafeCallers]` if we allow unsafe-callers-only on types) for this reason & 2 other reasons.
Firstly, presumably if a method is marked as safe, then it cannot be overridden with something unsafe-callers-only, unless the implementing type is also unsafe-callers-only, since that would allow something like this:
interface I1
{
    void X();
}
// case 2: mark C1 as unsafe-callers-only
// idea 3: have a mechanism to mark C1's implementation of I1 as unsafe - this runs into the problem that you can't ever cast the type to a base type or pass to generic in a safe context, because you could then dynamically discover that it implements I1 in the safe context, allowing you to call I1.X() on it in a safe context also
class C1 : I1
{
    public unsafe void X() => ...; // something unsafe; potential error spot: can't implement a safe method with an unsafe method
}
C1 c = new(); // if case 2, we get the error here due to C1 being unsafe-callers-only & we are good
I1 i = c;
i.X(); // if there's no error by this point, then we just did something unsafe without being in an unsafe context
But, if we require unsafe-callers-only members to only be able to override other unsafe-callers-only members, we run into a new problem: now we may need to mark an abstract method as unsafe-callers-only for 1 subtype's definition, which now causes all subtypes' implementations to be unsafe-callers-only, unless we can opt out by making it a safe context again somehow (although this specific thing could presumably be achieved by just not adding the `unsafe` keyword on the overriding member).
Now consider that I write this:
interface I1
{
    void X();
}
class C1
{
    public unsafe virtual void X() => ...; // something unsafe
}
unsafe class C2 : C1;
unsafe class C3 : C2, I1 // is it legal to implement I1, since I1.X expects a safe impl but we can't make a safe context within an unsafe-callers-only type?
{
    public override void X() => ...; // something not unsafe, but we're in an unsafe context and it's unsafe-callers-only because the class is marked unsafe-callers-only
}
unsafe class C4 : C2, I1 // is it legal to implement I1, since I1.X expects a safe impl?
{
    public override void X() => base.X(); // note: this is an unsafe operation
}
I1 i;
C1 c;
// is this valid?
// is it only valid if we know it's specifically C3 and we know that C3's impl for I1 is safe, even though that's not guaranteed by any compatibility (source or binary)?
// should casting any instance of any unsafe type to a base class only be valid if we know that all current and future implementations of all non-unsafe-callers-only interface members happen to be safe, even though they have to be marked unsafe-callers-only due to being in an unsafe-callers-only type?
unsafe { c = GetC2(); }
i = (I1)c;
i.X();
I'm not sure what the correct answers to the above are, but I think it needs careful consideration to make sure we come up with the right solution, and I do think that the ability to introduce a safe context is likely a part of that solution.
The other reason is that I think we should aim to keep unsafe contexts to a minimum, but if you need to mark a whole type as unsafe-callers-only for some reason (e.g., if the default value for a struct type represents a state that is unsafe), then you immediately lose the ability to have any non-unsafe contexts for anything textually inside that type, which I think is not ideal.
|
||
In this document **memory safety** is strictly defined as: code can never acquire a reference to memory that is not managed by the runtime. "Managed" here does not refer solely to heap-allocated, garbage-collected memory, but also includes stack-allocated variables that are considered allocated by the runtime, or memory that is acquired for legal use by the runtime through any other means. | ||
|
||
No access to uninitialized memory means that all memory is either never read before it has been initialized, or it has been initialized to a zero value. |
There was a problem hiding this comment.
> or it has been initialized to a zero value

Does this mean that `FormatterServices.GetUninitializedObject` would be safe, since it initialises everything to `0`? That seems wrong, since we shouldn't have to mark methods `unsafe` just because someone might have gotten an instance from `FormatterServices.GetUninitializedObject` and hence broken our assumptions about our fields being set up in a valid state according to how we modify them in our safe APIs. Can this somehow be adjusted to make sure it correctly classifies `FormatterServices.GetUninitializedObject` as problematic, whilst not making other valid zero initialisation become invalid?
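To make the concern concrete, here is a minimal sketch of how a constructor-established invariant can be bypassed. It assumes `RuntimeHelpers.GetUninitializedObject` as a stand-in for the `FormatterServices` method named above (my understanding is that the two behave the same way, but that is an assumption), and `Buffer4` is a hypothetical type invented for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type whose safe API assumes a constructor always ran.
public sealed class Buffer4
{
    // The field initializer runs as part of the constructor.
    private readonly int[] _items = new int[4];

    // Relies on the invariant "_items is never null".
    public int First() => _items[0];
}

public static class UninitializedDemo
{
    public static bool ThrowsOnUninitializedInstance()
    {
        // Allocates a zeroed instance without running any constructor
        // or field initializer, so _items stays null.
        var obj = (Buffer4)RuntimeHelpers.GetUninitializedObject(typeof(Buffer4));
        try
        {
            obj.First();
            return false;
        }
        catch (NullReferenceException)
        {
            // The type's invariant is broken, but the failure is a managed exception.
            return true;
        }
    }
}
```

In this sketch the broken invariant surfaces as a `NullReferenceException`. The worry raised above applies to types that use such an invariant to justify unsafe operations internally.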
There was a problem hiding this comment.
I currently don’t see a reason why that method needs to be unsafe, but we could choose to apply the marker anyway.
There was a problem hiding this comment.
`GetUninitializedObject` is private reflection. It should get the same treatment as the rest of private reflection once we figure out what to do about it.
|
||
We need to be able to annotate code as unsafe, even if it doesn't use pointers. | ||
|
||
Mechanically, this would be done with a modification to the C# language and a new property to the compilation. When the compilation property "EnableRequiresUnsafe" is set to true, the `unsafe` keyword on C# _members_ would require that their uses appear in an unsafe context. An `unsafe` block would be unchanged -- the statements in the block would be in an unsafe context, while the code outside would have no requirements. |
There was a problem hiding this comment.
Would `EnableRequiresUnsafe` be something that is enabled by default at some point in the future? If not, then I think a lot of pre-existing projects would completely miss out on this safety, since many people will probably not become aware of this feature and never even look in the csproj. That seems less than ideal to me; IMO, ideally we enable it by default for the .NET X+ (where X = 10 or 11, etc.) TFMs, with an opt-out switch (which would also work as an opt-in switch for older TFMs). This way, all projects would eventually benefit from the new unsafe analysis when they upgrade to a new enough .NET version, or they would at least become aware of it and have to make a conscious decision to disable it for now.
An approach like this, where it becomes a new default, would probably be problematic if we make breaking changes to what `unsafe` means on types and members (notably, this would be solved if we had a separate keyword for unsafe-callers-only). But if we don't enable it by default, we are effectively ensuring that many existing projects will never even become aware of this feature, at least for a substantial period of time (presumably someone on the team will eventually find out, or the project will die), and can't benefit from it.
There was a problem hiding this comment.
> Would EnableRequiresUnsafe be something that is enabled by default at some point in the future?

This is a breaking change in the language, so that depends on the observed severity of the break.
In the case of `Caller1`, the call to `M()` doesn't produce an error because it is inside an unsafe context. However, calls to `Caller1` will now produce an error for the same reason as `M()`. | ||
|
||
`Caller2` will also not produce an error because `M()` is in an unsafe context. However, this code creates a responsibility for the programmer: by presenting a safe API around an unsafe call, they are asserting that all safety concerns of `M()` have been addressed. |
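For readers without the full document at hand, the situation described here can be sketched roughly as follows. This is an illustrative reconstruction, not the document's own sample: the method bodies and the `SafeEntryPoint` helper are assumptions, and the code needs `AllowUnsafeBlocks` to compile. Under today's rules everything below compiles; the commented-out line is the call that would become an error once "EnableRequiresUnsafe" is on.

```csharp
using System;

public static class CallerUnsafeSketch
{
    // Under the proposal, `unsafe` on a member means its uses must
    // appear in an unsafe context.
    public static unsafe int M() => 42;

    // Caller1 is itself marked unsafe, so its body is an unsafe context
    // and the requirement propagates to Caller1's own callers.
    public static unsafe int Caller1() => M();

    // Caller2 presents a safe API: the unsafe block is where the author
    // asserts that all safety concerns of M() have been addressed.
    public static int Caller2()
    {
        unsafe
        {
            return M();
        }
    }

    // Hypothetical safe caller, added only for illustration.
    public static int SafeEntryPoint()
    {
        // int a = Caller1(); // would be an error under the proposal (allowed today)
        return Caller2();      // fine: Caller2 is not marked unsafe
    }
}
```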
There was a problem hiding this comment.
The rising verbosity of requiring `unsafe` in more places may be annoying. Consider the BCL: we frequently use unsafe constructs for efficient implementation, but usually don't expose unsafety in the public API. This would require many `unsafe` contexts to be limited to field or block level, instead of class or method level.
There was a problem hiding this comment.
This document doesn’t just present an annotation, it presents a formalism. By definition, safe code cannot cause memory safety issues because the runtime guarantees the safety. This means that applying unsafe is not a judgment call — if code can cause a memory safety issue, it needs to be unsafe. If not, it does not.
If too much code is unsafe, we would have to narrow the property we’re attempting to prove. Otherwise the formalism is unsound.
|
||
## Proposal | ||
|
||
We need to be able to annotate code as unsafe, even if it doesn't use pointers. |
There was a problem hiding this comment.
Propagating `unsafe` warnings too widely would be very disappointing and would cause developers to disable the new `unsafe` mechanism altogether. We should take acceptance into consideration.
I think the key point is to make customers aware of unsafety at the use site.
|
||
The overall goal is to ensure .NET code is "valid," in the sense that certain properties are always true. Generating a complete list of such properties is out of scope of this document. However, at least the following properties are required: | ||
|
||
* Memory safety |
There was a problem hiding this comment.
I think it's crucial to differentiate the unsafety levels. Type safety and GC safety should get the highest severity because of the ability to break the runtime. There are continuous tickets for inappropriate unsafe API usage causing hard-to-diagnose bugs.
For example, `Unsafe.AsPointer` is super-unsafe and strictly audited in the runtime repo. `Unsafe.As` between unmanaged types is "safer" and frequently used in buffer management. There should be a mechanism to disallow only the former.
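As a point of reference for the "safer" case, a minimal sketch of `Unsafe.As` between two unmanaged types of the same size might look like this. `ReinterpretDemo` and `BitsToFloat` are hypothetical names used only for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReinterpretDemo
{
    public static float BitsToFloat(int bits)
    {
        // Reinterprets the 4 bytes of an int as a float. Both types are
        // unmanaged and the same size, so no memory outside `bits` is read.
        // A mismatched pair such as Unsafe.As<int, long> would read past
        // the end of the variable, which is why the API is not checked.
        return Unsafe.As<int, float>(ref bits);
    }
}
```

For this particular pair the same result is available through `BitConverter.Int32BitsToSingle`, which carries no such hazard.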
There was a problem hiding this comment.
> Type safety and GC safety should get highest severity because of the ability to break runtime

Nearly all memory safety bugs can crash the runtime. For example, incorrect use of `NativeMemory.Alloc`/`Free` can cause very hard-to-diagnose crashes in the runtime code as well.
There was a problem hiding this comment.
The safety property I suggested, memory safety, is derived from an analysis of severe security breaches across the industry. Things in this category are known to be very dangerous and give attackers significant power. I don’t think there’s a further subdivision of this category that is interesting. Unsafe.As and Unsafe.AsPointer are both capable of producing memory safety problems and are therefore both unsafe.
The only other property I’m suggesting is viewing uninitialized memory. This is less severe. We could drop it if necessary. At the moment I’m proposing no other properties that would be covered by the unsafe feature and therefore further categorization is irrelevant.
|
||
## Proposal | ||
|
||
We need to be able to annotate code as unsafe, even if it doesn't use pointers. |
There was a problem hiding this comment.
There is also a widely seen formalism that disallows `unsafe` syntax entirely and uses the `Marshal` family instead, which is inefficient and typically more unsafe. We should provide a path for moving people away from that.
|
||
**ArrayPool.Rent** | ||
|
||
This method is unsafe because it returns an array with uninitialized memory. Code must not read the contents of the returned array without initialization. |
There was a problem hiding this comment.
How is uninitialized defined here? Is this because the initial array is allocated with GC.AllocateUninitializedArray, or is this because someone can rent an array, write data into it, return it, and then someone else renting gets that data?
If it's the former, I'd rather just change the Rent implementation to use new[] instead of AllocateUninitializedArray. If it's the latter, that won't help.
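Whichever definition is intended, the usage pattern the document asks for can be sketched as below: write before reading, and only touch the requested prefix. `PoolDemo` and `SumOfSquares` are hypothetical names, and passing `clearArray: true` on return is one possible mitigation for the second scenario, not something the document prescribes.

```csharp
using System;
using System.Buffers;

public static class PoolDemo
{
    public static int SumOfSquares(int count)
    {
        // The rented array may be longer than requested, and its contents
        // are unspecified (possibly data left behind by a previous renter).
        int[] rented = ArrayPool<int>.Shared.Rent(count);
        try
        {
            // Only the first `count` elements belong to this operation.
            Span<int> span = rented.AsSpan(0, count);

            // Initialize every element before any read.
            for (int i = 0; i < span.Length; i++)
            {
                span[i] = i * i;
            }

            int sum = 0;
            foreach (int value in span)
            {
                sum += value;
            }
            return sum;
        }
        finally
        {
            // clearArray: true zeroes the array so the next renter cannot
            // observe this caller's data.
            ArrayPool<int>.Shared.Return(rented, clearArray: true);
        }
    }
}
```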
|
||
## Background | ||
|
||
C# has had the `unsafe` feature since 1.0. There are two different syntaxes for the feature: a block syntax that can be used inside methods and a modifier that can appear on members and types. The original semantics only concern pointers. An error is produced if a variable of pointer type appears outside an unsafe context. For the block syntax, this is anywhere inside the block; for members this is inside the member; for types this is anywhere inside the type. Pointer operations are not fully validated by the type system, so this feature is useful at identifying areas of code needing more validation. Unsafe has subsequently been augmented to also turn off lifetime checking for ref-like variables, but the fundamental semantics are unchanged -- the `unsafe` context serves only to avoid an error that would otherwise occur. |
There was a problem hiding this comment.
> The original semantics only concern pointers.

It also included things adjacent, like sizeof(T). As part of this effort, I'd like us to revisit those choices. Based on the direction in this doc, seems sizeof(T) should not be unsafe.
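A small sketch of the current state of affairs, for context: `sizeof` on built-in types is already allowed in safe code, `sizeof` on a user-defined struct requires an unsafe context, and `Unsafe.SizeOf<T>()` returns the same information without one. `Point` and `SizeDemo` are hypothetical names for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical struct with two 4-byte fields.
public struct Point
{
    public int X;
    public int Y;
}

public static class SizeDemo
{
    // Built-in types: sizeof is a compile-time constant, usable in safe code.
    public static int IntSize() => sizeof(int);

    // sizeof(Point) would need an unsafe context today, but the same
    // value is available here without one.
    public static int PointSize() => Unsafe.SizeOf<Point>();
}
```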