SC: Implement the core functionalities of sync calls #1251

linh2931 · 2025-03-12T19:42:13Z

This PR implements the core functionalities of sync calls, in particular the call host function. The unit tests show that the basic call flows work:

Sync calls whose receivers are the same as the senders
Sync calls whose receivers are different from the senders
Nested sync calls
Recursive sync calls
Sequential sync calls

Notes for reviewing:

Please review roughly in this order: libraries/chain/webassembly/sync_call.cpp, libraries/chain/apply_context.cpp, then libraries/chain/webassembly/runtimes/eos-vm.cpp
eosvm-oc is not working yet; that work is tracked by "Make OC allow for nested Executor execution" #1043 (the sync call unit tests skip eosvm-oc)
Please double-check whether the transaction_checktime_timer related changes are correct
Tests will be updated to use return values and console output once sync call return values and call traces are implemented (the next immediate tasks)
Resetting the apply_context iterator caches is tracked by a separate issue to keep this PR focused on the main call flow: "SC: Invalidate iterator_cache in apply_context after a sync call" #1252

Resolves #1214

linh2931 · 2025-03-12T19:50:55Z

libraries/chain/include/eosio/chain/platform_timer.hpp

@@ -26,17 +26,21 @@ struct platform_timer {
      callback still execute if the timer expires and stop() is called nearly simultaneously.
      However, set_expiration_callback() is synchronized with the callback.
   */
-   void set_expiration_callback(void(*func)(void*), void* user) {
+   void set_expiration_callback(void(*func)(void*), void* user, bool appending = false) {


If the approach is deemed OK, I will replace this default parameter with a regular parameter and define an enum in place of the boolean, like {multi_callbacks_allowed, multi_callbacks_disallowed}.
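The enum idea could look roughly like the sketch below. It is only an illustration under stated assumptions: `callback_mode`, `toy_timer`, and `callback_count()` are hypothetical names, the replace-versus-append behavior is a guess at the intent, and the synchronization that the real platform_timer performs is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical replacement for the `bool appending` parameter.
enum class callback_mode { multi_callbacks_disallowed, multi_callbacks_allowed };

// Toy stand-in for platform_timer; locking is omitted on purpose.
struct toy_timer {
   using callback_t = void (*)(void*);

   void set_expiration_callback(callback_t func, void* user, callback_mode mode) {
      // Appending a null callback makes no sense; null is only used to clear.
      assert(func != nullptr || mode == callback_mode::multi_callbacks_disallowed);
      if (mode == callback_mode::multi_callbacks_disallowed)
         callbacks.clear();               // a plain set replaces whatever was registered
      if (func != nullptr)
         callbacks.push_back({func, user});
   }

   std::size_t callback_count() const { return callbacks.size(); }

   std::vector<std::pair<callback_t, void*>> callbacks;
};
```

With an enum, every call site has to spell out which behavior it wants, which a defaulted bool does not force.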

libraries/chain/webassembly/runtimes/eos-vm.cpp

spoonincode · 2025-03-14T22:30:36Z

libraries/chain/wasm_interface.cpp

@@ -91,6 +91,11 @@ namespace eosio { namespace chain {
      my->apply( code_hash, vm_type, vm_version, context );
   }

+   void wasm_interface::do_sync_call( const digest_type& code_hash, const uint8_t& vm_type, const uint8_t& vm_version, apply_context& context ) {
+      ilog("wasm_interface::do_sync_call");


looks like a debug message left in

spoonincode · 2025-03-14T22:39:13Z

libraries/chain/include/eosio/chain/platform_timer.hpp

         _callback_variables_busy.store(false, std::memory_order_release);
      }
   }

   std::atomic_bool _callback_variables_busy = false;
-   void(*_expiration_callback)(void*) = nullptr;
-   void* _expiration_callback_data;
+   std::vector<std::pair<void(*)(void*), void*>> _expiration_callbacks;


I've been thinking about the changes in this file some (or more generally just the semantics of timeout in the code). I am not 100% sure I completely grok the world yet to know for sure all the ramifications.

I think the code as is probably has a race. Notice that the callbacks are not called if the "spin lock" is currently held (it behaves like a "trylock"), and this "lock" is held in set_expiration_callback(). Consider what happens when a sync call completes: the timed_run completes and then goes to remove its callback, and during the removal of the callback the timer expires. This expiration will be ignored from EOS VM's perspective (all code sections remain executable). If we then reenter the caller's JIT code, that JIT code can no longer be timed out and can run forever.

I think for this particular example it can be resolved by ensuring we do a manual checktime() call after the callee completes but before we return control to the caller.

Thanks for thinking through this!

I added a checktime() before returning to the caller. ab70d89
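The interleaving discussed in this thread can be replayed on a single thread with a toy model. This is a minimal sketch, not the platform_timer code: `toy_timer`, `on_expire()`, and `expiration_missed_during_removal()` are hypothetical names, and the exception type is a stand-in for whatever checktime() raises in the real code.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>

// Toy model of the timer; this is not platform_timer.
struct toy_timer {
   using callback_t = void (*)(void*);

   // Runs when the deadline passes. Behaves like a trylock: if the flag is
   // already held, the callback is skipped rather than waited for.
   void on_expire() {
      expired.store(true, std::memory_order_release);
      bool expected = false;
      if (busy.compare_exchange_strong(expected, true, std::memory_order_acquire)) {
         if (callback) callback(user);
         busy.store(false, std::memory_order_release);
      }
   }

   // Spins until the flag is free, then installs (or clears) the callback.
   void set_expiration_callback(callback_t func, void* data) {
      bool expected = false;
      while (!busy.compare_exchange_weak(expected, true, std::memory_order_acquire))
         expected = false;
      callback = func;
      user     = data;
      busy.store(false, std::memory_order_release);
   }

   // Manual deadline check, playing the role of checktime().
   void checktime() const {
      if (expired.load(std::memory_order_acquire))
         throw std::runtime_error("deadline exceeded");
   }

   std::atomic_bool busy{false};
   std::atomic_bool expired{false};
   callback_t       callback = nullptr;
   void*            user     = nullptr;
};

// Replays the problematic interleaving: the timer fires while the flag is
// held for callback removal, so the callback never runs.
inline bool expiration_missed_during_removal(toy_timer& t) {
   bool fired = false;
   t.set_expiration_callback([](void* p) { *static_cast<bool*>(p) = true; }, &fired);
   assert(!t.busy.load());  // nobody holds the flag yet
   t.busy.store(true);      // removal in progress
   t.on_expire();           // timer fires now; the trylock fails
   t.callback = nullptr;    // removal finishes
   t.user     = nullptr;
   t.busy.store(false);
   return !fired;
}
```

In this model the callback is lost, but a `checktime()` call made after the removal still observes the expired deadline and throws, which is the role of the manual check before control returns to the caller.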

spoonincode · 2025-03-14T22:39:40Z

libraries/chain/include/eosio/chain/platform_timer.hpp

+         return;
+      }
+
+      _expiration_callbacks.push_back({func, user});


It does seem kind of unfortunate that we need to maintain a stack of callbacks when it seems like there will effectively be only 2 unique ones registered. We could remove the stack with something like a swap_expiration_callback() and place the onus of restoring the value on whoever did the swap (much like we have a requirement today that whoever registers a callback clears it). Some other approaches I can think of may go against the JIT's current design (for example, instead of wiring up the callback to a timed_run, we somehow wire it up directly to the backend allocator). I guess we can keep this as is for now but it'd be nice to find something better.
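One way to picture the swap_expiration_callback() idea is a single callback slot plus a guard that restores the previous value. The sketch below is an assumption-laden illustration: `toy_timer`, `slot_t`, `fire()`, and `scoped_callback_swap` are hypothetical, and the synchronization a real timer would need is omitted.

```cpp
#include <cassert>
#include <utility>

// Hypothetical single-slot timer; not the real platform_timer interface.
struct toy_timer {
   using callback_t = void (*)(void*);
   using slot_t     = std::pair<callback_t, void*>;

   // Installs `next` and hands back whatever was registered before.
   slot_t swap_expiration_callback(slot_t next) {
      return std::exchange(slot, next);
   }

   // Stands in for the timer expiring.
   void fire() const {
      if (slot.first) slot.first(slot.second);
   }

   slot_t slot{nullptr, nullptr};
};

// Whoever swaps is responsible for restoring. A guard does that on scope
// exit, including when an exception unwinds through the callee.
class scoped_callback_swap {
public:
   scoped_callback_swap(toy_timer& t, toy_timer::slot_t next)
      : timer(t), previous(t.swap_expiration_callback(next)) {
      assert(next.first != nullptr);  // the callee must install a real callback
   }
   ~scoped_callback_swap() { timer.swap_expiration_callback(previous); }

   scoped_callback_swap(const scoped_callback_swap&)            = delete;
   scoped_callback_swap& operator=(const scoped_callback_swap&) = delete;

private:
   toy_timer&        timer;
   toy_timer::slot_t previous;
};
```

Nested calls then form an implicit stack on the C++ call stack, so the timer itself only ever stores one callback.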

spoonincode · 2025-03-14T23:27:13Z

libraries/chain/webassembly/runtimes/eos-vm.cpp

+      void do_sync_call(apply_context& context) override {
+         backend_t                                bkend;
+         typename eos_vm_runtime<Impl>::context_t exec_ctx;
+         vm::wasm_allocator                       wasm_alloc;


It would be preferable to not create and destroy this for every call; it's "expensive" from the standpoint of allocating and freeing pages each time.

Can leave like this for now but,

Have you checked that this is not leaking memory? wasm_allocator requires an explicit free() call to free its resources and I don't believe it's called anywhere.

wasm_allocator requires an explicit free() call to free its resources

Odd that it mmaps in its constructor and doesn't have a destructor to munmap.

Yeah I do suspect free() should just be moved to be the dtor. Especially since it looks like free() isn't being called anywhere. (I haven't investigated it much though)

Thanks. I changed to use a pre-created pool of allocators to avoid expensive allocation for each sync call.

48c3e59
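The suggestion to move free() into the destructor amounts to ordinary RAII around the mapping. The sketch below shows that shape only; `mapped_region` is a hypothetical class and is not eos-vm's wasm_allocator, whose actual layout and mapping flags are not reproduced here. It assumes a POSIX system where `MAP_ANONYMOUS` is available.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <sys/mman.h>

// Hypothetical RAII wrapper: map in the constructor, unmap in the destructor.
class mapped_region {
public:
   explicit mapped_region(std::size_t bytes) : len(bytes) {
      assert(bytes > 0);  // mmap rejects a zero length
      void* p = ::mmap(nullptr, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (p == MAP_FAILED)
         throw std::runtime_error("mmap failed");
      base = static_cast<char*>(p);
   }

   // No explicit free() call is needed; leaving scope releases the pages.
   ~mapped_region() { ::munmap(base, len); }

   // Copying would lead to a double munmap, so it is disabled.
   mapped_region(const mapped_region&)            = delete;
   mapped_region& operator=(const mapped_region&) = delete;

   char*       data() { return base; }
   std::size_t size() const { return len; }

private:
   char*       base = nullptr;
   std::size_t len;
};
```

A type like this cannot be stored in a resizable std::vector without move support, which would have to be added if it were pooled.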

greg7mdp · 2025-03-17T14:40:05Z

libraries/chain/include/eosio/chain/apply_context.hpp

@@ -621,6 +628,8 @@ class apply_context {
      bool                          privileged   = false;
      bool                          context_free = false;

+      std::optional<sync_call_context> sync_call_ctx{};  // only one of act and sync_call_ctx can be present


Instead of apply_context having members that may or may not be present, I think it would be cleaner to have something like the following (which would also remove the need for a separate class sync_call_context):

struct context_base {
   ... // all common members
};

struct apply_context : public context_base {
   uint32_t recurse_depth; ///< how deep inline actions can recurse
   uint32_t first_receiver_action_ordinal = 0;
   uint32_t action_ordinal = 0;
};

struct call_context : public context_base {
   account_name sender;
   account_name receiver;
   std::span<const char> data; // includes function name, arguments, and other information
};

Thanks. The Implementation section of the design document talked about this too. This PR focuses on demonstrating the feasibility of sync calls. I will create a task to track it.

libraries/chain/apply_context.cpp

heifner · 2025-03-18T14:24:59Z

libraries/chain/apply_context.cpp

+
+action_name apply_context::get_sync_call_sender() const {
+   // Current context's receiver is the sender of next sync call
+   return sync_call_ctx ? sync_call_ctx->receiver : get_receiver();


Should this be an assert? Is it valid to call outside of a sync call? Also I don't understand what the comment is trying to convey.

huh this makes me realize -- is there a way for a sync call to see who its caller is? (and if not, is that something desirable?)

We have sender argument in the sync_call entry point.

ah okay good missed that

Should this be an assert? Is it valid to call outside of a sync call? Also I don't understand what the comment is trying to convey.

A sync call can be initiated by a sync call (using sync_call_ctx) or by an action (using apply_context). It is valid for this function to be called outside of a sync call.

I clarified the comments and moved the function to the private section.

c2f95c0
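The rule explained in this thread, that whichever contract is currently running becomes the sender of the next sync call, can be modeled in a few lines. This is a simplified sketch: `toy_apply_context` and `action_receiver` are hypothetical, and `account_name` is reduced to a string for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

using account_name = std::string;  // stand-in for the chain's name type

struct sync_call_context {
   account_name sender;
   account_name receiver;
};

// Toy context: it was created either for an action or for a sync call.
struct toy_apply_context {
   std::optional<account_name>      action_receiver;  // set when running an action
   std::optional<sync_call_context> sync_call_ctx;    // set when running a sync call

   // The contract that is running now is the sender of any call it makes.
   account_name get_sync_call_sender() const {
      // Only one of the two may be present.
      assert(action_receiver.has_value() != sync_call_ctx.has_value());
      return sync_call_ctx ? sync_call_ctx->receiver : *action_receiver;
   }
};
```

So a call made from an action on alice is sent by alice, while a call made from inside a sync call that alice sent to bob is sent by bob.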

heifner · 2025-03-18T14:26:48Z

libraries/chain/include/eosio/chain/apply_context.hpp

@@ -597,8 +601,11 @@ class apply_context {
      bool is_privileged()const { return privileged; }
      action_name get_receiver()const { return receiver; }
      const action& get_action()const { return *act; }
+      const action* get_action_ptr()const { return act; }


Is this only used to assert? If so maybe the method should be something more like assert_sync_call.

All the validations which disallow certain functions under sync calls are planned to be done by #1218. I updated its description.

heifner · 2025-03-18T15:05:01Z

unittests/sync_call_tests.cpp

+   validating_tester t;
+
+   if( t.get_config().wasm_runtime == wasm_interface::vm_type::eos_vm_oc ) {
+      // skip eos_vm_oc for now.


OC changes have not been done.

heifner · 2025-03-18T15:25:13Z

libraries/chain/webassembly/runtimes/eos-vm.cpp

+         typename eos_vm_runtime<Impl>::context_t exec_ctx;
+         vm::wasm_allocator                       wasm_alloc;
+
+         const std::optional<sync_call_context> sync_call_ctx  = context.get_sync_call_ctx();


Seems expensive to make this copy of the sync call data.
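The cost being pointed out comes from the accessor returning the optional by value. The sketch below counts copies to make that visible; `toy_context`, `get_by_value()`, `get_by_ref()`, and the `copies` counter are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

struct sync_call_context {
   explicit sync_call_context(std::vector<char> d) : data(std::move(d)) {}
   sync_call_context(const sync_call_context& other) : data(other.data) { ++copies; }
   sync_call_context(sync_call_context&&) = default;

   std::vector<char> data;             // call payload; potentially large
   static inline int copies = 0;       // counts copy constructions
};

struct toy_context {
   std::optional<sync_call_context> ctx;

   // Returning by value copies the whole payload on every call.
   std::optional<sync_call_context> get_by_value() const { return ctx; }

   // Returning a const reference exposes the same data without a copy.
   const std::optional<sync_call_context>& get_by_ref() const { return ctx; }
};

// Reads the payload size without copying the context.
inline std::size_t payload_size(const toy_context& c) {
   const auto& ctx = c.get_by_ref();
   assert(ctx.has_value());
   return ctx->data.size();
}
```

The reference is only valid while the owning context is alive, which is the usual trade-off for avoiding the copy.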

spoonincode · 2025-03-18T19:12:41Z

libraries/chain/webassembly/runtimes/eos-vm.cpp

         } catch(eosio::vm::timeout_exception&) {
            context.trx_context.checktime();
         } catch(eosio::vm::wasm_memory_exception& e) {
            FC_THROW_EXCEPTION(wasm_execution_error, "access violation: ${d}", ("d", e.detail()));
         } catch(eosio::vm::exception& e) {
            FC_THROW_EXCEPTION(wasm_execution_error, "eos-vm system failure: ${d}", ("d", e.detail()));
         }
+
+         if (multi_expr_callbacks_allowed) {
+            context.trx_context.checktime(); // protect against the case where during the removal of the callback, the timer expires.


I haven't thought it through entirely but my hunch is this check needs to be higher up (like maybe at the end of execute_sync_call() or such). Think about a case where the callee is being run by OC but the caller by JIT. We would need to add this checktime in both OC & JIT; or preferably just one common place higher up.

I moved it to execute_sync_call for now and added a comment to the OC issue. We will find a common place when doing OC change.

fc22049

spoonincode · 2025-03-18T19:46:35Z

libraries/chain/webassembly/runtimes/eos-vm.cpp

@@ -152,6 +152,7 @@ class eos_vm_instantiated_module : public wasm_instantiated_module_interface {
         };

         execute(context, bkend, exec_ctx, wasm_alloc, fn, true);
+         context.control.return_sync_call_wasm_allocator(); // return the wasm_allocator obtainded by get_sync_wasm_allocator back to the pool


Does this need to be make_scoped so it still runs upon exception?

alternatively maybe sync_call_wasm_alloc_index can just be reset to 0 upon a new apply context

We cannot just reset it to 0 upon a new apply context, as the apply context can make multiple sequential sync calls. I changed it to use a scoped exit mechanism.

06bdd12

I added a test to make 500 sync calls in a loop to verify wasm allocators are properly released after each call (we are not running out of them).

b1cbe35
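The pool and scoped-exit approach discussed in this thread can be sketched as follows. Everything here is a stand-in under stated assumptions: `toy_allocator`, `allocator_pool`, `scoped_allocator`, and `run_sync_call()` are hypothetical names, and the real code keeps the pool and index in the controller rather than in a class like this.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

struct toy_allocator { int id = 0; };  // stand-in for vm::wasm_allocator

// Hypothetical pool: one allocator per nesting level, handed out in stack order.
class allocator_pool {
public:
   explicit allocator_pool(std::size_t depth) : allocators(depth) {}

   toy_allocator& acquire() {
      if (next == allocators.size())
         throw std::runtime_error("sync call depth exceeded");
      return allocators[next++];
   }

   void release() {
      assert(next > 0);  // release must pair with an earlier acquire
      --next;
   }

   std::size_t in_use() const { return next; }

private:
   std::vector<toy_allocator> allocators;
   std::size_t                next = 0;
};

// Minimal scope guard: releases in the destructor, so it also runs on unwind.
class scoped_allocator {
public:
   explicit scoped_allocator(allocator_pool& p) : pool(p), alloc(p.acquire()) {}
   ~scoped_allocator() { pool.release(); }

   scoped_allocator(const scoped_allocator&)            = delete;
   scoped_allocator& operator=(const scoped_allocator&) = delete;

   toy_allocator& get() { return alloc; }

private:
   allocator_pool& pool;
   toy_allocator&  alloc;
};

// Simulates one sync call that may fail part-way through.
inline void run_sync_call(allocator_pool& pool, bool fail) {
   scoped_allocator guard(pool);
   if (fail)
      throw std::runtime_error("callee failed");
}
```

Because the release happens in a destructor, a throwing callee still returns its allocator, so a long loop of sequential calls never exhausts a pool sized for the maximum nesting depth.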

spoonincode · 2025-03-18T19:59:51Z

libraries/chain/controller.cpp

@@ -1013,6 +1013,8 @@ struct controller_impl {
   thread_local static platform_timer timer; // a copy for main thread and each read-only thread
 #if defined(EOSIO_EOS_VM_RUNTIME_ENABLED) || defined(EOSIO_EOS_VM_JIT_RUNTIME_ENABLED)
   thread_local static vm::wasm_allocator wasm_alloc; // a copy for main thread and each read-only thread
+   thread_local static std::vector<vm::wasm_allocator> sync_call_wasm_alloc; // used for sync call. expensive to create one for each call
+   uint32_t            sync_call_wasm_alloc_index;


Is this really okay not being thread_local? Wouldn't there be a problem for sync calls in parallel run read only transactions?

Thanks for your good eyes! I intended it to be thread_local but did not do that. Will correct it.

It was a thread_local below in

spring/libraries/chain/controller.cpp

Line 5073 in fc22049

thread_local uint32_t sync_call_wasm_alloc_index = 0;

Fixed in 06bdd12

spoonincode · 2025-03-18T20:00:27Z

libraries/chain/controller.cpp

@@ -1301,6 +1303,9 @@ struct controller_impl {
         if( shutdown ) shutdown();
      } );

+      sync_call_wasm_alloc.resize(16); // Will change to use max_sync_call_depth of global property object in future version


Seems like this would need to be resized on each read only thread too?

It might make more sense to do more of that work on #1257 -- it would have to resize upon the parameters changed etc

Yes, on #1257, I will do the resize changes using the new parameters.

I added the initialization of sync_call_wasm_alloc on read threads at start up 06bdd12. Will update it in #1257 with actual sync_call_depth.

I created a task for resizing the vector, as it requires some experimentation: #1269

Probably also dependent on if we allow privileged sync calls, since a privileged sync call might, for example, reduce the sync call depth beyond the current depth

spoonincode · 2025-03-21T16:54:52Z

It seems like a sync call can not be privileged. I can't tell if that's intentional or not (it's not mentioned in design doc), and I don't know what to immediately think about it (i.e. does it prevent useful use cases of sync calls?)

linh2931 · 2025-03-21T21:06:29Z

It seems like a sync call can not be privileged. I can't tell if that's intentional or not (it's not mentioned in design doc), and I don't know what to immediately think about it (i.e. does it prevent useful use cases of sync calls?)

@arhag, any comments?

spoonincode · 2025-03-26T21:53:28Z

@linh2931 which PR do you want to plumb in privileged support?

linh2931 · 2025-03-26T22:08:40Z

@linh2931 which PR do you want to plumb in privileged support?

I plan to do it in the restructured sync_call_context class, and I need to consider whether max_sync_call_depth may be changed while in a sync call (probably not allowed). I just created an issue to track the task: #1279

spoonincode · 2025-03-27T02:24:02Z

libraries/chain/include/eosio/chain/apply_context.hpp

+      std::optional<sync_call_context> sync_call_ctx{};  // only one of act and sync_call_ctx can be present
+
+      // Returns the sender of any sync call initiated by this apply_context or its sync_call_ctx
+      action_name get_sync_call_sender() const;


It is strange that this is action_name for a name that isn't an action. But I see it's the same for get_sender()

In #1273, get_sync_call_sender() is replaced with the virtual function get_sender(). Maybe the return type action_name should be changed to account_name.

Yeah and it's still action_name there too. Maybe could change it in that PR since so much else is being changed.

ericpassmore · 2025-04-03T01:15:10Z

Note:start
category: Other
component: SyncCalls
summary: Initial commit of core functionality for sync calls.
Note:end

linh2931 added 7 commits March 12, 2025 14:01

Update host function call to use actual apply_conetxt's execute_sync_call

755fdba

Implement apply_context::execute_sync_call()

372fe98

Add sync_call type definition

b78a3ae

Implement changes related to eos-vm (wasm)

a9cc89c

Implement eos_vm_instantiated_module::do_sync_call()

ac29bbd

Make transaction_checktime_timer to support multiple expiration callbacks

6b6c66f

Add unit tests for sync calls

af2df59

linh2931 requested review from spoonincode and heifner March 12, 2025 19:42

linh2931 commented Mar 12, 2025

linh2931 added 6 commits March 12, 2025 16:16

Update protocol_feature_tests/sync_call_activation_test to include sync_call entry point

f83cd99

Rename sync_call.hpp with sync_call_context.hpp and sync_call type with sync_call_context to better reflect what they are and make sync_call_context fit into apply_context better

b250deb

Construct sync_call_ctx at apply_context constructor

b833fe9

Implement apply_context::get_sync_call_sender()

859d77b

Assert only one of action and sync call can be present in an apply_context

14e9cf7

Refactor eos_vm_instantiated_module::apply() to use new common methods

fc53065

spoonincode reviewed Mar 14, 2025

greg7mdp reviewed Mar 17, 2025

remove leftover debugging logging statements

dcfaa50

heifner suggested changes Mar 18, 2025

heifner reviewed Mar 18, 2025

Do a checktime() before returning to the sync caller to prevent a race condition

ab70d89

spoonincode reviewed Mar 18, 2025

Use a wasm allocator pool to avoid expensive creating allocator for each call

48c3e59

spoonincode reviewed Mar 18, 2025

linh2931 mentioned this pull request Mar 18, 2025

Make OC allow for nested Executor execution #1043

Closed

Move checktime to execute_sync_call

fc22049

spoonincode reviewed Mar 18, 2025

linh2931 added 8 commits March 18, 2025 17:35

Use scoped exit for returning sync_call_wasm_allocator and correct thread_local definition for sync_call_wasm_alloc_index

06bdd12

Initialze sync_call_wasm_alloc for read threads

06795b6

Add a test to make a large number of sync calls in a loop

b1cbe35

Update verifyOcVirtualMemory to account for memory used by all wasm allocators in all read-only threads and the main thread

be211a1

Use sha256's empty() instead of comparing with digest_type()

81a6020

Clarify the comments about get_sync_call_sender and move it to private

c2f95c0

Make get_sync_call_ctx() return a const& so it is cheap to use

d031917

Use reference for get_sync_call_wasm_allocator and get_sync_call_ctx

6481988

heifner approved these changes Mar 24, 2025

linh2931 mentioned this pull request Mar 26, 2025

SC: privileged support #1279

Closed

spoonincode approved these changes Mar 27, 2025

linh2931 merged commit e52828d into sync_call Mar 27, 2025
36 checks passed

linh2931 deleted the call_implementation branch March 27, 2025 11:52
