-
Notifications
You must be signed in to change notification settings - Fork 237
Emit deallocation instructions for all supported languages #1391
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@alexcrichton might be the best to give some insight here as to whether this is possible or if there are other constraints |
About |
Sorry I haven't kept up-to-date on the other PR referenced here. Can you describe why you think other languages are already leaking and how this solves things? We have some tests for "shouldn't leak" in various places, although they're not comprehensive so it doesn't mean something didn't slip through the cracks. Given all the test failures here though I'm not sure that this PR as-is will work? |
According to the wit-bindgen doc for the exported functions, the guest must free the memory wit-bindgen/crates/c/README.md Line 95 in ca55f12
The PR aims to parse all function parameters and emit a deallocation instruction when possible. ExampleGiving the following wit file: package tests:component@0.1.0;
world test {
export accept-string: func(s: string, b: string);
} C# was leakingSee: C is leakingMissing // Generated by `wit-bindgen` 0.46.0. DO NOT EDIT!
#include "test.h"
#include <stdlib.h>
#include <string.h>
// Exported Functions from `test`
// Canonical ABI intrinsics
__attribute__((__weak__, __export_name__("cabi_realloc")))
void *cabi_realloc(void *ptr, size_t old_size, size_t align, size_t new_size) {
(void) old_size;
if (new_size == 0) return (void*) align;
void *ret = realloc(ptr, new_size);
if (!ret) abort();
return ret;
}
// Helper Functions
void test_string_set(test_string_t *ret, const char*s) {
ret->ptr = (uint8_t*) s;
ret->len = strlen(s);
}
void test_string_dup(test_string_t *ret, const char*s) {
ret->len = strlen(s);
ret->ptr = (uint8_t*) cabi_realloc(NULL, 0, 1, ret->len * 1);
memcpy(ret->ptr, s, ret->len * 1);
}
void test_string_free(test_string_t *ret) {
if (ret->len > 0) {
free(ret->ptr);
}
ret->ptr = NULL;
ret->len = 0;
}
// Component Adapters
__attribute__((__export_name__("accept-string")))
void __wasm_export_exports_test_accept_string(uint8_t * arg, size_t arg0, uint8_t * arg1, size_t arg2) {
test_string_t arg3 = (test_string_t) { (uint8_t*)(arg), (arg0) };
test_string_t arg4 = (test_string_t) { (uint8_t*)(arg1), (arg2) };
exports_test_accept_string(&arg3, &arg4);
}
// Ensure that the *_component_type.o object is linked in
extern void __component_type_object_force_link_test(void);
__attribute__((used))
void __component_type_object_force_link_test_public_use_in_this_compilation_unit(void) {
__component_type_object_force_link_test();
} CPP is not leakingIf I read correctly the // Generated by `wit-bindgen` 0.46.0. DO NOT EDIT!
// Ensure that the *_component_type.o object is linked in
#ifdef __wasm32__
extern "C" void __component_type_object_force_link_test(void);
__attribute__((used))
void __component_type_object_force_link_test_public_use_in_this_compilation_unit(void) {
__component_type_object_force_link_test();
}
#endif
#include "test_cpp.h"
#include <cstdlib> // realloc
extern "C" void *cabi_realloc(void *ptr, size_t old_size, size_t align, size_t new_size);
__attribute__((__weak__, __export_name__("cabi_realloc")))
void *cabi_realloc(void *ptr, size_t old_size, size_t align, size_t new_size) {
(void) old_size;
if (new_size == 0) return (void*) align;
void *ret = realloc(ptr, new_size);
if (!ret) abort();
return ret;
}
extern "C" __attribute__((__export_name__("accept-string")))
void accept_string(uint8_t * arg0, size_t arg1, uint8_t * arg2, size_t arg3)
{
auto len0 = arg1;
auto len1 = arg3;
exports::test::AcceptString(wit::string((char const*)(arg0), len0), wit::string((char const*)(arg2), len1));
}
// Component Adapters /// A string in linear memory, freed unconditionally using free
///
/// A normal C++ string makes no guarantees about where the characters
/// are stored and how this is freed.
class string {
uint8_t const *data_;
size_t length;
// C++ is horrible!
//constexpr uint8_t const *const empty_ptr = (uint8_t const *)1;
static uint8_t const* empty_ptr() { return (uint8_t const *)1; }
public:
// this constructor is helpful for creating vector<string>
string(string const &b) : string(string::from_view(b.get_view())) {}
string(string &&b) : data_(b.data_), length(b.length) { b.data_ = nullptr; }
string &operator=(string const &) = delete;
string &operator=(string &&b) {
if (data_ && data_!=empty_ptr()) {
free(const_cast<uint8_t *>(data_));
}
data_ = b.data_;
length = b.length;
b.data_ = nullptr;
return *this;
}
string(char const *d, size_t l) : data_((uint8_t const *)d), length(l) {}
char const *data() const { return (char const *)data_; }
size_t size() const { return length; }
bool empty() const { return !length; }
~string() {
if (data_ && data_!=empty_ptr()) {
free(const_cast<uint8_t *>(data_));
}
}
// leak the memory
void leak() { data_ = nullptr; }
// typically called by post
static void drop_raw(void *ptr) { free(ptr); }
std::string_view get_view() const {
return std::string_view((const char *)data_, length);
}
std::string to_string() const {
return std::string((const char *)data_, length);
}
static string from_view(std::string_view v) {
if (!v.size()) return string((char const*)empty_ptr(), 0);
char* addr = (char*)malloc(v.size());
memcpy(addr, v.data(), v.size());
return string(addr, v.size());
}
char* begin() {
return (char*)data_;
}
char* end() {
return (char*)data_ + length;
}
char const* begin() const {
return (char const*)data_;
}
char const* end() const {
return (char const*)data_ + length;
}
}; RustI'm not sure what happens here and if the borrow checker is involved somehow |
Aha ok thanks for the explanation! I think I see what's going on here. The intent is that the pointer/length used to create a string when receiving it from the outside world in this case is an owned allocation. The guest language, in this case the bindings generator, is responsible for freeing it. This is why Rust/C++ do not leak is because they have codified that they take ownership of the allocation. Rust, for example, avoids copying the string as a result and uses the allocation in-place. Your example with C does indeed leak but that's because it doesn't explicitly call a deallocation function. I realize that what I'm saying here is "your C code leaks because you didn't write it to not leak" which can feel a bit odd, but C isn't a great example of leaks/resource management since everything is manual anyway so the semantics of the bindings generator kind of don't matter. Effectively to write correct C you have to have all the rules in your head at all times. For C# I think the correct behavior here will be to convert the pointer/length to a C#-native string and then deallocate the original allocation. That should prevent leaks and is what's required if C# can't use the pointer/length as-is at rest. Given the configurability of what can be done with the pointer allocation, I'd prefer to leave this up to individual bindings generators to decide what to do rather than codify that it's always free'd. |
To give an example of what Alex is talking about, in the StarlingMonkey JS runtime we use the C bindings, but create JS strings directly from the pointer/length pair without copying. If the bindings deallocated unconditionally, we'd be forced to do an extra copy instead |
Makes sense, as in c# we are using .net Not suggesting we add that here, just a thought for the future. |
I'm not very satisfied with #1389 because as I wrote, other languages still leaking.
I saw the existence of the following instructions which seems to be not used (or at least I didn't find a way to trigger them).
wit-bindgen/crates/core/src/abi.rs
Lines 525 to 557 in 545dc8d
My proposal is quite simple: parse all sign parameters, push them to the stack and deallocate. This should solve all memory leaks for all languages (a revert of #1389 is necessary btw).
I'm not sure about that
Deallocate::Lists
parameter.Also the generation fails when the sign is marked as
indirect_params
, not sure how to do procede yet.Appreciate a feedback!
@jsturtevant @yowl @pavelsavara