Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(decoder): Add support for Structured CLP IR streams but without log-level filtering. #85

Merged
merged 10 commits into from
Nov 15, 2024

Conversation

junhaoliao
Copy link
Member

@junhaoliao junhaoliao commented Oct 1, 2024

References

https://github.com/y-scope/clp-ffi-js/releases/tag/v0.3.0 added support for decoding Structured CLP IR streams but without log-level filtering. In the decoded results returned by the reader, the message contains a complete JSON-serialized log event. Before display the log events in the log viewer, we can use the existing formatter for JSON logs to format the log events to improve readbility.

Description

  1. Update clp-ffi-js to 0.3.0.
  2. Add Structured CLP IR stream handling into ClpIrDecoder but without log-level filtering.

Validation performed

  1. Referred to Add support for deserializing and decoding v0.1.0 IR streams, but without log-level parsing and filtering. clp-ffi-js#30 , loaded the test-irv2.clp.zip file in the browser and observed the 1000 log events displayed as formatted messages in the readonly editor.
    image
  2. Referred to Add support for deserializing and decoding v0.1.0 IR streams, but without log-level parsing and filtering. clp-ffi-js#30 , sanitize tested the "Zstd-compressed CLP IR stream files with older versions".

Summary by CodeRabbit

  • New Features

    • Updated dependency for clp-ffi-js to enhance performance and stability.
    • Added support for bigint in timestamp handling for improved data processing.
  • Improvements

    • Enhanced flexibility in decoder options across multiple classes, allowing for better customization.
    • Improved handling of different stream types in the ClpIrDecoder for more robust data parsing.
  • Bug Fixes

    • Corrected type definitions to ensure consistency across the application, reducing potential errors.
  • Documentation

    • Updated comments and documentation to reflect changes in type definitions and options.

Copy link

coderabbitai bot commented Oct 1, 2024

Walkthrough

The pull request includes updates to various files within the project, primarily focusing on the type definitions related to decoder options. The package.json file has been modified to upgrade the clp-ffi-js dependency. Several classes, including LogFileManager, ClpIrDecoder, and JsonlDecoder, have had their method signatures updated to use a new DecoderOptions type instead of the deprecated DecoderOptionsType. Additionally, the dayjsBigIntSupport plugin has been added to the MainWorker. Other changes include updates to utility functions and type definitions across the codebase.

Changes

File Change Summary
package.json Updated dependency version: "clp-ffi-js": "^0.2.0""clp-ffi-js": "^0.3.0"
src/services/LogFileManager/index.ts Updated method signatures to replace DecoderOptionsType with DecoderOptions in multiple methods.
src/services/MainWorker.ts Added dayjsBigIntSupport plugin to dayjs, no changes to existing logic.
src/services/decoders/ClpIrDecoder.ts Renamed ClpIrStreamReader to ClpStreamReader, updated constructor and method signatures to use DecoderOptions.
src/services/decoders/JsonlDecoder/index.ts Updated constructor and method signatures to use DecoderOptions instead of JsonlDecoderOptionsType.
src/services/decoders/JsonlDecoder/utils.ts Updated convertToDayjsTimestamp function to accept bigint as a parameter type.
src/typings/config.ts Changed type for DECODER_OPTIONS in ConfigMap from JsonlDecoderOptionsType to DecoderOptions.
src/typings/decoders.ts Renamed JsonlDecoderOptionsType to DecoderOptions, updated related method signatures and exports.
src/typings/worker.ts Updated WorkerReqMap to use DecoderOptions instead of DecoderOptionsType.
src/utils/config.ts Updated import and type casting for DecoderOptions in getConfig function.

Possibly related PRs

Suggested reviewers

  • kirkrodrigues

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@junhaoliao junhaoliao changed the title WIP: Add initial code for ClpIrV2Decoder. WIP: Add Structured CLP IR stream handler into ClpIrDecoder. Nov 7, 2024
#streamReader: ClpIrStreamReader;
#streamReader: ClpStreamReader;

#formatter: Nullable<LogbackFormatter>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a formatter interface might as well use it

Suggested change
#formatter: Nullable<LogbackFormatter>;
#formatter: Nullable<Formatter>;

Comment on lines 96 to 119
results.forEach((result) => {
const [message, timestamp] = result;
let fields: JsonObject = {};
try {
fields = JSON.parse(message) as JsonObject;
if (false === isJsonObject(fields)) {
throw new Error("Unexpected non-object.");
} else if (null !== this.#logLevelKey) {
const logLevel = fields[this.#logLevelKey];
fields[this.#logLevelKey] = LOG_LEVEL[logLevel as LOG_LEVEL];
}
} catch (e) {
if (0 === message.length) {
return;
}
console.error(e, message);
}

// @ts-expect-error `this.#formatter` is certainly non-null
result[0] = this.#formatter.formatLogEvent({
fields: fields,
timestamp: convertToDayjsTimestamp(timestamp),
});
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a simpler version. I feel like we dont need to get the log level since it should already be in the result. I also dont think we need to check if the length is 0, since i dont think clp-ffi-js would return length 0. I think worst case it will return {} if the log event is empty.

Suggested change
results.forEach((result) => {
const [message, timestamp] = result;
let fields: JsonObject = {};
try {
fields = JSON.parse(message) as JsonObject;
if (false === isJsonObject(fields)) {
throw new Error("Unexpected non-object.");
} else if (null !== this.#logLevelKey) {
const logLevel = fields[this.#logLevelKey];
fields[this.#logLevelKey] = LOG_LEVEL[logLevel as LOG_LEVEL];
}
} catch (e) {
if (0 === message.length) {
return;
}
console.error(e, message);
}
// @ts-expect-error `this.#formatter` is certainly non-null
result[0] = this.#formatter.formatLogEvent({
fields: fields,
timestamp: convertToDayjsTimestamp(timestamp),
});
});
for (const result of results) {
const [message, timestamp, level] = result;
let fields: JsonObject = {};
try {
fields = JSON.parse(message) as JsonObject;
if (false === isJsonObject(fields)) {
throw new Error("Unexpected non-object.");
}
} catch (e) {
console.error(e, message);
}
const dayJsTimestamp: Dayjs = convertToDayjsTimestamp(timestamp);
const logEvent: LogEvent = {
timestamp: dayJsTimestamp,
level: level,
fields: fields
}
result[0] = this.#formatter.formatLogEvent(logEvent);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we dont need to get the log level since it should already be in the result.

Right. For now we shall store log level strings as authoritative log levels in the logging library.

I also dont think we need to check if the length is 0, since i dont think clp-ffi-js would return length 0. I think worst case it will return {} if the log event is empty.

Good catch

@@ -14,13 +14,25 @@ interface LogEventCount {
* @property logLevelKey The key of the kv-pair that contains the log level in every record.
* @property timestampKey The key of the kv-pair that contains the timestamp in every record.
*/
interface JsonlDecoderOptionsType {
interface JsonlDecoderOptions {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about just naming this DecoderOptions and get rid of JsonlDecoderOptions, StructuredClpIrDecoderOptions, ClpIrDecoderOptions. I feel like we have all these extra types that serve no purpose. Maybe if we actually need different types in the future can add back?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

({logLevelKey} = decoderOptions);
}

return new ClpIrDecoder(streamReader, formatter, logLevelKey);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return new ClpIrDecoder(streamReader, formatter, logLevelKey);
return new ClpIrDecoder(streamReader, formatter);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one streamType argument to make the handling more obvious.

module.IRProtocolErrorCode.SUPPORTED === streamReader.getIrProtocolErrorCode()
) {
formatter = new LogbackFormatter({formatString: decoderOptions.formatString});
({logLevelKey} = decoderOptions);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
({logLevelKey} = decoderOptions);


constructor (streamReader: ClpIrStreamReader) {
constructor (streamReader: ClpStreamReader, formatter: Nullable<LogbackFormatter>, logLevelKey: Nullable<string>) {
Copy link
Contributor

@davemarco davemarco Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
constructor (streamReader: ClpStreamReader, formatter: Nullable<LogbackFormatter>, logLevelKey: Nullable<string>) {
constructor (streamReader: ClpStreamReader, formatter: Nullable<Formatter>) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clear to me why we need this bigint support in this PR? It seems to work with IRv1 stream reader?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two reasons:

  1. We enabled link flag -sWASM_BIGINT when compiling clp-ffi-js: https://github.com/y-scope/clp-ffi-js/blob/450e8f84f17ec233765f1e79e60984cd8085076e/CMakeLists.txt#L72
  2. The timestamps returned by reader.decodeRange() are indeed BigInt. See TS type of timestamp in decodeRange returned results should be BigInt instead of Number. clp-ffi-js#33 . Also:
    1. In IRv1, timestamps are of type int64_t which maps to BigInt by Emscripten thanks to the above flag; however, we previously have never processed the returned timestamps with Dayjs / at all so Dayjs's inability to deal with BigInt timestamps without extensions was not discovered.
    2. In IRv2, integers are of type int64_t. At the moment, we read all timestamp node values as integers and return directly without any processing.

@junhaoliao junhaoliao changed the title WIP: Add Structured CLP IR stream handler into ClpIrDecoder. Update clp-ffi-js to 0.3.0; Refactor ClpIrDecoder to also support Structured CLP IR streams. Nov 11, 2024
@junhaoliao junhaoliao changed the title Update clp-ffi-js to 0.3.0; Refactor ClpIrDecoder to also support Structured CLP IR streams. Update clp-ffi-js to 0.3.0; Refactor ClpIrDecoder to also support Structured CLP IR streams without log-level filtering. Nov 11, 2024
@junhaoliao junhaoliao changed the title Update clp-ffi-js to 0.3.0; Refactor ClpIrDecoder to also support Structured CLP IR streams without log-level filtering. Update clp-ffi-js to 0.3.0; Refactor ClpIrDecoder to also support Structured CLP IR streams but without log-level filtering. Nov 11, 2024
@junhaoliao junhaoliao requested a review from davemarco November 11, 2024 01:11
@junhaoliao junhaoliao marked this pull request as ready for review November 11, 2024 01:11
@@ -8,20 +8,16 @@ interface LogEventCount {
}

/**
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description intentionally omitted.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (5)
src/services/decoders/JsonlDecoder/utils.ts (1)

Line range hint 56-68: Consider improving type validation clarity

The condition structure could be more readable. Consider extracting the type validation into a separate function.

Here's a suggested refactoring:

 const convertToDayjsTimestamp = (field: JsonValue | bigint | undefined): dayjs.Dayjs => {
-    if (("string" !== typeof field &&
-        "number" !== typeof field &&
-        "bigint" !== typeof field) ||
-        "undefined" === typeof field
-    ) {
+    const isValidTimestampType = (field: JsonValue | bigint | undefined): boolean => {
+        return (typeof field === "string" ||
+                typeof field === "number" ||
+                typeof field === "bigint") &&
+               typeof field !== "undefined";
+    };
+
+    if (!isValidTimestampType(field)) {
         field = INVALID_TIMESTAMP_VALUE;
     }
src/utils/config.ts (1)

134-134: Consider adding type validation for localStorage data.

The type casting assumes that the object structure from localStorage matches DecoderOptions. Consider adding runtime validation to ensure type safety.

Example implementation:

function isValidDecoderOptions(value: unknown): value is DecoderOptions {
    return typeof value === 'object' && value !== null &&
        'formatString' in value && typeof value.formatString === 'string' &&
        'logLevelKey' in value && typeof value.logLevelKey === 'string' &&
        'timestampKey' in value && typeof value.timestampKey === 'string';
}

// Usage in getConfig:
const decoderOptions = {
    formatString: window.localStorage.getItem(
        LOCAL_STORAGE_KEY.DECODER_OPTIONS_FORMAT_STRING
    ),
    logLevelKey: window.localStorage.getItem(
        LOCAL_STORAGE_KEY.DECODER_OPTIONS_LOG_LEVEL_KEY
    ),
    timestampKey: window.localStorage.getItem(
        LOCAL_STORAGE_KEY.DECODER_OPTIONS_TIMESTAMP_KEY
    ),
};

if (!isValidDecoderOptions(decoderOptions)) {
    console.warn('Invalid decoder options in localStorage, using defaults');
    return CONFIG_DEFAULT[CONFIG_KEY.DECODER_OPTIONS];
}

return decoderOptions;
src/services/decoders/JsonlDecoder/index.ts (1)

Line range hint 84-88: Consider using a more specific type for formatter options.

While the change to DecoderOptions maintains consistency, this method only uses the formatString property. Consider creating a more focused type for formatter options to improve type safety and clarity.

interface FormatterOptions {
    formatString: string;
}

setFormatterOptions(options: FormatterOptions): boolean
src/services/LogFileManager/index.ts (1)

113-113: Document breaking changes in public API

The type changes in public methods (create and setFormatterOptions) represent breaking changes that could affect consumers. Please ensure these changes are:

  1. Documented in the changelog
  2. Reflected in the package version bump
  3. Migration guide provided if necessary

Also applies to: 141-141, 164-164

src/services/decoders/ClpIrDecoder.ts (1)

125-126: Avoid using the non-null assertion operator

In decodeRange, the non-null assertion operator ! is used on this.#formatter. While the method ensures this.#formatter is not null based on this.#streamType, it's safer to avoid using ! and instead use an explicit check or TypeScript's type guards to prevent potential runtime errors.

Apply this diff to use an explicit null check:

-            result[0] = this.#formatter!.formatLogEvent({
+            if (this.#formatter !== null) {
+                result[0] = this.#formatter.formatLogEvent({
                   fields: fields,
                   level: level,
                   timestamp: dayJsTimestamp,
                 });
+            }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a168a2a and e0bdb68.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (10)
  • package.json (1 hunks)
  • src/services/LogFileManager/index.ts (4 hunks)
  • src/services/MainWorker.ts (2 hunks)
  • src/services/decoders/ClpIrDecoder.ts (3 hunks)
  • src/services/decoders/JsonlDecoder/index.ts (4 hunks)
  • src/services/decoders/JsonlDecoder/utils.ts (1 hunks)
  • src/typings/config.ts (2 hunks)
  • src/typings/decoders.ts (3 hunks)
  • src/typings/worker.ts (2 hunks)
  • src/utils/config.ts (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/services/MainWorker.ts
🔇 Additional comments (16)
src/typings/config.ts (2)

30-30: Verify handling of existing configurations.

The type change in ConfigMap from JsonlDecoderOptionsType to DecoderOptions might affect stored configurations. Please ensure that:

  1. Existing configurations in localStorage are compatible with the new type
  2. A migration strategy exists if needed

Let's check the usage of this configuration:

#!/bin/bash
# Description: Find configuration storage and loading logic
# Expect: Code that handles decoder options persistence

rg -A 5 "CONFIG_KEY.DECODER_OPTIONS"

1-1: LGTM! The generic type name better reflects its broader usage.

The change from JsonlDecoderOptionsType to DecoderOptions is appropriate as it now represents options for multiple decoder types, including both JSONL and CLP IR streams.

Let's verify that all related imports have been updated consistently:

✅ Verification successful

Let me verify if there are any remaining references to JsonlDecoderOptionsType in the codebase.


The type renaming has been consistently applied across the codebase

The verification confirms that:

  • No references to the old JsonlDecoderOptionsType remain in the codebase
  • The new DecoderOptions interface is properly defined in decoders.ts
  • All usages of the decoder options type now reference the new DecoderOptions type
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining references to JsonlDecoderOptionsType
# Expect: No occurrences of the old type name

rg "JsonlDecoderOptionsType"

Length of output: 4997


Script:

#!/bin/bash
# Search for any remaining references to JsonlDecoderOptionsType
rg "JsonlDecoderOptionsType"

# Also search for any references to the old type in decoders.ts
rg -A 5 "DecoderOptions" "src/typings/decoders.ts"

Length of output: 492

src/services/decoders/JsonlDecoder/utils.ts (1)

56-56: Verify dayjs compatibility with bigint timestamps

The function now accepts bigint timestamps, but we should verify that dayjs properly handles bigint values to prevent potential runtime issues.

Let's verify the dayjs bigint handling:

Also applies to: 61-62

✅ Verification successful

Let me verify the implementation details of the bigint handling to ensure it's properly set up.


Bigint timestamp handling is properly configured

The implementation is correctly set up with:

  • The dayjsBigIntSupport plugin is properly imported and configured in MainWorker.ts before any timestamp conversions
  • The convertToDayjsTimestamp function includes proper type checking for bigint values
  • The dayjs extension is properly initialized using dayjs.extend(dayjsBigIntSupport)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if dayjsBigIntSupport plugin is properly imported and configured
# Expected: Find import and setup of dayjsBigIntSupport plugin

# Search for dayjsBigIntSupport plugin usage
rg -l "dayjsBigIntSupport"

# Search for dayjs plugin configuration
rg "dayjs\.(extend|use)"

Length of output: 248


Script:

#!/bin/bash
# Check the actual implementation of convertToDayjsTimestamp
rg -A 10 "convertToDayjsTimestamp.*=" src/services/decoders/JsonlDecoder/utils.ts

# Check if MainWorker initializes before any timestamp conversions
ast-grep --pattern 'class MainWorker {
  $$$
  constructor($$$) {
    $$$
  }
  $$$
}'

Length of output: 784

src/typings/decoders.ts (3)

15-19: LGTM! Successfully consolidated decoder options.

The unified DecoderOptions interface effectively replaces the specific decoder option types as discussed, maintaining a clean and maintainable type structure.


84-84: LGTM! Method signature updated correctly.

The parameter type update to DecoderOptions maintains consistency with the type consolidation.


107-107: LGTM! Export properly added.

The DecoderOptions type is correctly exported, maintaining alphabetical ordering.

package.json (1)

33-33: Verify the compatibility with clp-ffi-js ^0.3.0

The upgrade from ^0.2.0 to ^0.3.0 may introduce breaking changes as this is a pre-1.0 version. Please ensure:

  1. All breaking changes are addressed in the codebase
  2. The package-lock.json is updated and committed
  3. The dependent PR Update the IR format protocol version to 0.0.1 #30 in clp-ffi-js is merged and released

Consider adding a comment in the package.json explaining the dependency on PR #30 to help future maintainers understand the version constraints.

✅ Verification successful

Let me gather more information about the breaking changes and their impact.


Package update verification completed successfully

The verification shows:

  • package-lock.json is properly updated with version 0.3.0
  • The dependency is used only in ClpIrDecoder.ts for structured CLP IR streams
  • The integration is straightforward with proper initialization and usage patterns

The update appears safe to proceed with, considering:

  • package-lock.json reflects the correct version
  • Usage is well-encapsulated within a single service
  • The implementation aligns with the PR's objective to support Structured CLP IR streams
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the dependency update impact

# Test 1: Check if package-lock.json exists and contains the new version
echo "Checking package-lock.json..."
if [ -f "package-lock.json" ]; then
    rg "clp-ffi-js.*0\.3\.0" package-lock.json
else
    echo "Warning: package-lock.json not found"
fi

# Test 2: Find all imports of clp-ffi-js to verify usage
echo "Checking clp-ffi-js usage..."
rg "from ['\"]clp-ffi-js['\"]|require\(['\"]clp-ffi-js['\"]\)" -g '!{node_modules,dist}/**'

Length of output: 529


Script:

#!/bin/bash
# Check the actual usage of ClpStreamReader in the codebase
echo "Checking ClpStreamReader usage..."
rg "ClpStreamReader" -A 5 -B 5

# Look for any potential configuration or initialization code
echo "Checking clpFfiJsModuleInit usage..."
rg "clpFfiJsModuleInit" -A 5 -B 5

Length of output: 4082

src/typings/worker.ts (2)

4-4: LGTM! Type import update is consistent.

The renaming from DecoderOptionsType to DecoderOptions aligns with the broader refactoring effort and maintains proper import organization.


92-92: Verify worker communication type safety.

The type update looks correct. However, since this interface is crucial for worker communication, we should verify that all message handlers are updated accordingly.

✅ Verification successful

Type safety in worker communication is properly maintained

The verification shows that the DecoderOptions type is consistently used throughout the codebase:

  • Properly defined in src/typings/decoders.ts
  • Correctly used in worker message handling in MainWorker.ts
  • Consistently implemented in both JsonlDecoder and ClpIrDecoder
  • Properly passed through the configuration system

All message handlers and decoders are using the type correctly, maintaining type safety in the worker communication pipeline.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all worker message handlers use the new DecoderOptions type

# Test: Search for LOAD_FILE message handling to ensure type consistency
rg -A 5 "WORKER_REQ_CODE.LOAD_FILE" 

# Test: Search for any remaining references to the old type
rg "DecoderOptionsType"

Length of output: 1273


Script:

#!/bin/bash
# Let's verify the DecoderOptions type usage and implementation

# Check the definition and usage of DecoderOptions
ast-grep --pattern 'type DecoderOptions = $_'

# Search for any decoder option related types and interfaces
rg -g '*.ts' -g '*.tsx' 'type.*Decoder|interface.*Decoder' -A 3

# Look for actual usage of decoder options in the codebase
rg -g '*.ts' -g '*.tsx' 'decoderOptions' -B 2 -A 2

Length of output: 7299

src/utils/config.ts (1)

9-9: LGTM: Import statement update is consistent with type standardization.

The change from DecoderOptionsType to DecoderOptions aligns with the broader refactoring effort.

src/services/decoders/JsonlDecoder/index.ts (3)

7-7: LGTM: Import statement is correctly updated.

The addition of DecoderOptions to the imports aligns with the type system changes.


28-30: LGTM: Documentation is clear and accurate.

The JSDoc comment properly references DecoderOptions and clearly explains the purpose of the decoder.


Line range hint 52-57: Verify type compatibility with existing code.

The constructor now uses DecoderOptions instead of JsonlDecoderOptionsType. While the implementation looks correct, we should verify that all required properties (logLevelKey, timestampKey, formatString) are properly defined in the new type.

✅ Verification successful

Based on the shell script results, I can now generate the final response since I have verified the type compatibility:

Type compatibility verified - change is safe

The DecoderOptions interface in src/typings/decoders.ts contains all the required properties (formatString, logLevelKey, timestampKey) that were previously used with JsonlDecoderOptionsType. Additionally, there are no remaining references to JsonlDecoderOptionsType in the codebase, confirming a complete migration to the new type.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that DecoderOptions type contains all required properties
# and check for any direct usage of JsonlDecoderOptionsType in the codebase

# Check the definition of DecoderOptions
echo "Checking DecoderOptions definition:"
rg -A 10 "type DecoderOptions|interface DecoderOptions"

# Look for any remaining references to JsonlDecoderOptionsType
echo "Checking for remaining JsonlDecoderOptionsType references:"
rg "JsonlDecoderOptionsType"

Length of output: 829

src/services/LogFileManager/index.ts (2)

147-147: Verify ClpIrDecoder implementation compatibility

The ClpIrDecoder.create() call has been updated to use the new options type. Please ensure that the ClpIrDecoder class has been updated to handle the new DecoderOptions type correctly.

#!/bin/bash
# Check ClpIrDecoder implementation
ast-grep --pattern 'class ClpIrDecoder {
  $$$
  static create($_: $_) {
    $$$
  }
  $$$
}'

4-4: Verify removal of old type import

Please ensure that the old DecoderOptionsType import has been removed to prevent confusion and maintain clean imports.

✅ Verification successful

No old type import found, changes are clean

The search results confirm that:

  1. There are no remaining references to DecoderOptionsType in the codebase
  2. The new DecoderOptions interface is properly defined in src/typings/decoders.ts
  3. All usages of the type have been consistently updated across the codebase
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining references to DecoderOptionsType
rg "DecoderOptionsType" --type ts

Length of output: 95


Script:

#!/bin/bash
# Let's check for any imports of DecoderOptionsType in the entire codebase
rg "import.*DecoderOptionsType" --type ts

# Also check for any type declarations or interfaces named DecoderOptionsType
ast-grep --pattern 'type DecoderOptionsType = $_'
ast-grep --pattern 'interface DecoderOptionsType $_'

# Let's also verify the new DecoderOptions type is properly used
rg "DecoderOptions" --type ts -A 2

Length of output: 4433

src/services/decoders/ClpIrDecoder.ts (1)

88-90: ⚠️ Potential issue

Add null check for 'this.#formatter' in 'setFormatterOptions'

When streamType is UNSTRUCTURED, this.#formatter is set to null in the constructor. In setFormatterOptions, attempting to instantiate a new LogbackFormatter when this.#formatter is null may not be intended. Consider adding a null check to ensure that this.#formatter is not null before updating it.

Apply this diff to add the null check:

 setFormatterOptions (options: DecoderOptions): boolean {
+    if (this.#formatter === null) {
+        return true;
+    }
     this.#formatter = new LogbackFormatter({formatString: options.formatString});
     return true;
 }

Likely invalid or redundant comment.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
src/services/decoders/ClpIrDecoder.ts (3)

16-19: Consider moving shared utilities to a common location.

The utility functions convertToDayjsTimestamp and isJsonObject are imported from JsonlDecoder/utils. Since these are now used by multiple decoders, consider moving them to a shared utilities directory (e.g., src/utils/decoders) to improve maintainability and reduce duplication.


49-51: Enhance method documentation.

The JSDoc comment could be more descriptive about the decoderOptions parameter. Consider documenting the expected properties and their impact on the decoder's behaviour.


98-133: Consider using immutable operations in decodeRange.

The current implementation mutates the result array by directly modifying result[0]. Consider creating a new array with transformed results to improve code maintainability and reduce potential side effects.

Apply this diff:

-        results.forEach((result) => {
+        return results.map((result) => {
             const [
                 message,
                 timestamp,
                 level,
             ] = result;
             const dayJsTimestamp: Dayjs = convertToDayjsTimestamp(timestamp);
             let fields: JsonObject = {};

             try {
                 fields = JSON.parse(message) as JsonObject;
                 if (false === isJsonObject(fields)) {
                     throw new Error("Unexpected non-object.");
                 }
             } catch (e) {
                 console.error(e, message);
             }

-            // `this.#formatter` has been null-checked at the method entry.
-            // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
-            result[0] = this.#formatter!.formatLogEvent({
+            return [
+                this.#formatter!.formatLogEvent({
                     fields: fields,
                     level: level,
                     timestamp: dayJsTimestamp,
-            });
+                }),
+                result[1],
+                result[2]
+            ] as DecodeResult;
-        });
-
-        return results;
+        });
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between e0bdb68 and 479fd71.

📒 Files selected for processing (3)
  • src/services/decoders/ClpIrDecoder.ts (2 hunks)
  • src/services/decoders/JsonlDecoder/index.ts (7 hunks)
  • src/typings/decoders.ts (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/services/decoders/JsonlDecoder/index.ts
  • src/typings/decoders.ts
🔇 Additional comments (2)
src/services/decoders/ClpIrDecoder.ts (2)

22-25: LGTM! Clear and well-defined enum.

The enum effectively distinguishes between structured and unstructured stream types with self-documenting values.


34-43: ⚠️ Potential issue

Add validation for decoderOptions.formatString.

When initializing the formatter for structured streams, ensure decoderOptions.formatString is defined to prevent potential runtime errors.

Apply this diff:

-        this.#formatter = (streamType === CLP_IR_STREAM_TYPE.STRUCTURED) ?
-            new LogbackFormatter({formatString: decoderOptions.formatString}) :
-            null;
+        this.#formatter = (streamType === CLP_IR_STREAM_TYPE.STRUCTURED) ?
+            new LogbackFormatter({
+                formatString: decoderOptions.formatString ?? '%d{ISO8601} [%p] %m%n'
+            }) :
+            null;

Likely invalid or redundant comment.

}

/**
* Creates a new ClpIrDecoder instance.
*
* @param dataArray The input data array to be passed to the decoder.
* @param decoderOptions The options are only effective if the stream type is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cant suggest but

    /**
     * Creates a new ClpIrDecoder instance.
     *
     * @param dataArray The input data array to be passed to the decoder.
     * @param decoderOptions
     * @return The created ClpIrDecoder instance.
     *
     * NOTE: `decoderOptions` only affects decode results if the stream type is
     * type is {@link CLP_IR_STREAM_TYPE.STRUCTURED}.
     */

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the NOTE above the parameter list just so jsdoc doesn't get confused by thinking it's part of the "return".

console.error(e, message);
}

// `this.#formatter` has been null-checked at the method entry.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The for of loop fixed this weird typescript error, but if you like for each better, I think they are similiar.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, having the check in the callback confuses TS. In favour of getting rid of the eslint override i agree it's better to use a for-loop.

@@ -53,12 +53,13 @@ const convertToLogLevelValue = (field: JsonValue | undefined): LOG_LEVEL => {
* - the timestamp's value is an unsupported type.
* - the timestamp's value is not a valid dayjs timestamp.
*/
const convertToDayjsTimestamp = (field: JsonValue | undefined): dayjs.Dayjs => {
const convertToDayjsTimestamp = (field: JsonValue | bigint | undefined): dayjs.Dayjs => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi i just noticed i dont think json.parse will work with big int? Like it will throw exception since it is not a json value? How do you feel about using this? There is also this hack i saw on github. I feel like in practice we won't see these weird timestamps, but wtv u think is best.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also how do u feel about this file to utils? Since it is shared by both IR and Json now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right that's an issue... Just to understand why we need the mitigation, where are we using json.parse with BigInts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially also in json decoder if logs have an bigint?

Copy link
Member Author

@junhaoliao junhaoliao Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omg...

  1. nlohmann/json acknowledges there are implementation whose number types can store up to 2^53-1 : https://json.nlohmann.me/features/types/number_handling/#number-interoperability
  2. yet it does serialize numbers out of that range, imo for good reasons though: https://github.com/nlohmann/json/blob/4a602df34e71afca2242b34c743d4cc2486cd071/tests/src/unit-serialization.cpp#L168-L183

funny finding: https://www.github.com/nlohmann/json/issues/1708 -> https://www.github.com/nlohmann/json/pull/1722

Let's discuss the issue with the clp core people before we make changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, calling JSON.parse() with any clp-ffi-js-returned JSON string containing number in +/- 2^53-1 ~ 2^64-1 would encounter loss of precision, but luckily nothing throws.
Let's address the loss of precision issue in another PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be tracked at #116

@junhaoliao junhaoliao requested a review from davemarco November 15, 2024 15:50
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (5)
src/services/decoders/ClpIrDecoder.ts (5)

22-25: Add JSDoc documentation for the CLP_IR_STREAM_TYPE enum.

Consider adding documentation to explain the purpose and usage of each stream type for better maintainability.

+/**
+ * Enum representing the types of CLP IR streams that can be processed.
+ * @enum {string}
+ */
 enum CLP_IR_STREAM_TYPE {
+    /** Structured logs containing JSON-formatted messages */
     STRUCTURED = "structured",
+    /** Unstructured logs containing plain text messages */
     UNSTRUCTURED = "unstructured",
 }

34-43: Add parameter validation in the constructor.

The constructor should validate its parameters to ensure robust initialization.

 constructor (
     streamType: CLP_IR_STREAM_TYPE,
     streamReader: ClpStreamReader,
     decoderOptions: DecoderOptions
 ) {
+    if (!Object.values(CLP_IR_STREAM_TYPE).includes(streamType)) {
+        throw new Error(`Invalid stream type: ${streamType}`);
+    }
+    if (!streamReader) {
+        throw new Error('StreamReader is required');
+    }
     this.#streamType = streamType;
     this.#streamReader = streamReader;
     this.#formatter = (streamType === CLP_IR_STREAM_TYPE.STRUCTURED) ?
         new LogbackFormatter({formatString: decoderOptions.formatString}) :
         null;
 }

89-91: Add validation in setFormatterOptions method.

The method should validate the formatString option before creating a new formatter.

 setFormatterOptions (options: DecoderOptions): boolean {
+    if (!options?.formatString) {
+        console.error('Invalid formatter options: formatString is required');
+        return false;
+    }
     this.#formatter = new LogbackFormatter({formatString: options.formatString});
     return true;
 }

103-111: Enhance error handling for missing formatter.

The error message could be more descriptive and include troubleshooting information.

 if (null === this.#formatter) {
     if (this.#streamType === CLP_IR_STREAM_TYPE.STRUCTURED) {
-        // eslint-disable-next-line no-warning-comments
-        // TODO: Revisit when we allow displaying structured logs without a formatter.
-        console.error("Formatter is not set for structured logs.");
+        console.error(
+            "Formatter is not set for structured logs. " +
+            "Please ensure setFormatterOptions is called before decoding structured logs."
+        );
     }
     return results;
 }

122-129: Improve error handling for JSON parsing.

Consider adding more context to the error message and handling specific JSON parsing errors differently.

 try {
     fields = JSON.parse(message) as JsonObject;
     if (false === isJsonObject(fields)) {
         throw new Error("Unexpected non-object.");
     }
 } catch (e) {
-    console.error(e, message);
+    if (e instanceof SyntaxError) {
+        console.error(`Invalid JSON format in message: ${message}`, e);
+    } else {
+        console.error(`Error processing message: ${message}`, e);
+    }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 479fd71 and 023e345.

📒 Files selected for processing (1)
  • src/services/decoders/ClpIrDecoder.ts (2 hunks)
🔇 Additional comments (1)
src/services/decoders/ClpIrDecoder.ts (1)

1-19: LGTM! Imports are well-organized and properly utilized.

The imports are logically grouped and all imported entities are used within the implementation.

@junhaoliao junhaoliao changed the title Update clp-ffi-js to 0.3.0; Refactor ClpIrDecoder to also support Structured CLP IR streams but without log-level filtering. feat(decoder): Add support for Structured CLP IR streams but without log-level filtering. Nov 15, 2024
@junhaoliao junhaoliao merged commit 978b0af into y-scope:main Nov 15, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants