-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mono][tasks] Extract EmitWasmBundle into generic EmitBundle task and enable bundling in mono self-contained library #84191
Conversation
Tagging subscribers to this area: @directhex Issue Detailsnull
|
92c21a6
to
6cf49fe
Compare
6cf49fe
to
9084234
Compare
/azp run runtime-wasm |
Azure Pipelines successfully started running 1 pipeline(s). |
Looking at other instances where there is runtime/src/mono/mono/mini/monovm.c Lines 135 to 163 in 79fda00
runtime/src/mono/mono/metadata/assembly.c Lines 2741 to 2754 in 79fda00
The new bundling apis rely on a hashtable, which has the feature of aliasing known assembly extensions The latter specifically mentions bundle, so it seems "safe" to remove that block altogether, but not sure if the former is specific to bundling. It also relies on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Great work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just couple of lazy questions, looks great!
I will follow-up with #87284
if (!strcmp (bsymfile->aname, assembly_name)) | ||
return TRUE; | ||
#ifdef ENABLE_WEBCIL | ||
const char *p = strstr (assembly_name, ".webcil"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does (unbundled) webcil still work ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which tests would validate unbundled webcil scenario? I think I was having trouble knowing how to enable webcil when running wasm/wasi tests.
This function, however, is only invoked in the two bundling specific APIs right below.
The new bundling logic will alias all known extensions of assemblies .dll
, .wasm
, .webcil
(if enabled) to .dll
, so that helps eliminate a chunk of logic that had searched for the different variations, hence the logic to check for .webcil
or .wasm
is no longer needed in bundling search paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
webcil is currently useful for browser. The debugger probably doesn't know about the format difference and possibly not even about name difference. I don't know if we have automated tests for debugger on webcil.
System.Runtime.Loader.Tests on wasm tests the satellite assembly bundling logic, registration and lookup are working as intended and all of those tests pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
review in progress..
namespace Microsoft.WebAssembly.Build.Tasks; | ||
|
||
public class EmitWasmBundleObjectFiles : EmitWasmBundleBase | ||
public class EmitBundleObjectFiles : EmitBundleBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use some namespace for this. can be done in a follow up PR, along with adding common namespace for the other tasks in this assembly.
private Dictionary<string, string> resourceDataSymbolDictionary = new(); | ||
|
||
private Dictionary<string, string[]> resourcesForDataSymbolDictionary= new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Private instance fields should start with _
like _resourceDataSymbolDictionary
.
And these two should be readonly
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefixed with _
What is the rationale for making them readonly
? These variables are being modified within the Execute()
method, is that considered the constructor for tasks, in which readonly fields should only be modified within as suggested here.
/// Could have RegisteredName, otherwise it would be the filename. | ||
/// RegisteredName should be prefixed with namespace in form of unix like path. For example: "/usr/share/zoneinfo/" | ||
[Required] | ||
public ITaskItem[] FilesToBundle { get; set; } = default!; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public ITaskItem[] FilesToBundle { get; set; } = default!; | |
public ITaskItem[] FilesToBundle { get; set; } = Array.Empty<ITaskItem>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed, out of curiosity, why is this preferred over the default keyword?
string? managedAssemblyCulture = null; | ||
|
||
var resourceMetadataReader = PEReaderExtensions.GetMetadataReader(resourcePEReader); | ||
if (resourceMetadataReader.IsAssembly) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this return true for satellite assemblies also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I believe it does. Put in some logging and saw that this block was hit for satellite assemblies as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
review in progress..
bundledResource.SetMetadata("RegisteredName", registeredName); | ||
} | ||
|
||
string resourceDataSymbol = $"bundled_resource_{ToSafeSymbolName(TruncateEncodedHash(Utils.ComputeHashEx(resourcePath, Utils.HashAlgorithmType.SHA256, Utils.HashEncodingType.Base64Safe), MaxEncodedHashLength))}"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The output filename should be more descriptive though, like:
string destinationFileName = $"{Path.GetFileName(resourcePath)}_{ToSafeSymbolName(TruncateEncodedHash(Utils.ComputeHashEx(resourcePath, Utils.HashAlgorithmType.SHA256, Utils.HashEncodingType.Base64Safe), MaxEncodedHashLength))}_bundle{GetOutputFileExtension()}";
EDIT: .. which would be useful when debugging a build. It might be even be useful to emit these into a subdir bundles
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here is that we will share identical bundled assets so we just generate one bundled item instead of duplicates for identical content, that means multiple different resources will map the same destination file, so that is why we can't use the file name as part of the destination file name, only the hash. Inside the generated file (if its a source file) there is a list of all resources that maps the same destination file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be useful to add more logging around this to aid potential debugging mapping input asset -> destination file, but we would need to keep the naming based on the unique content of the input asset in order to only emit one bundled resource for identical input assets. One example where we could have identical input is time zone info, in that case we will register the different time zones using its unique resource name in runtime, but we share one binary representation if several time zones end up with identical time zone data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_resourcesForDataSymbolDictionary
tracks which resource registered names have the same file contents and therefore have the same symbol names. I'll add logging around _resourcesForDataSymbolDictionary
logic to note when multiple resources have been mapped to the same destination file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
/azp run runtime-wasm |
Azure Pipelines successfully started running 1 pipeline(s). |
Contributes to #79377
In order for Mono's self-contained shared|static library mode to truly be self-contained (i.e. not relying on external resources such as compiled managed assemblies existing on disk), bundling the resources into the shared library (i.e. storing the data of necessary resources directly inside the shared library itself) is necessary. Wasm and Wasi already has a form of bundling implemented in #82250.
This PR looks to extract the bundling logic out of wasm specific files to allow other mobile platforms leverage the technology, and incorporate bundling into mono's library mode built through #81919 with runtime initialization through #82253 and #83050.
This PR does the following:
Extract Wasm's bundling task into a more general task
Incorporates bundling into Mono's self-contained library mode
Expand bundling to handle various types of resources (assemblies + associated pdb, satellite assemblies, other data resources [i.e. runtimeconfig.bin/timezones])
MonoBundle*Resource
structs all aligned off ofMonoBundledResource
to handle various typesMonoBundled*Resource
sModifies EmitBundle input parameters
RegisteredFile
metadata designating what the resource should be registered under in the hashtable.mono_bundled_resources_add
bundling api.MonoBundled*Resource
s and define passed in registration functionOutput generated bundled resource files from EmitBundle task
DataSymbol
,DataLenSymbol
,DataLenSymbolValue
,RegisteredName
, andDestinationFile
metadataDeprecates old bundling api in favor of new bundling api
Example Files
snippet of source file containing byte array data + lengths
source file to register preallocated resource structs, including typedefs and resource symbols
Testing
Mono's self-contained library mode without bundling had been tested in #83050
It was done by building an Android Functional Test with changes that
NativeLib=shared
andForceFullAOT=true
(in addition to other tweaks to generate the app in library mode)DOTNET_LIBRARY_ASSEMBLY_PATH
In order to validate bundling,
DOTNET_LIBRARY_ASSEMBLY_PATH
was not set, andBundlesAssemblies
was passed to the LibraryBuilder. The app was able to bundle assemblies + runtimeconfig.bin, initialize runtimeconfig, auto initialize mono runtime, and invoke the managed side API properly.