Use error messages from jcc err.log in experiments #230
Force-pushed from 7de4eb5 to 844aef5
Force-pushed from 844aef5 to caf19de
Special cases when get_jcc_errstr() cannot find error message but it should:
In all other projects, get_jcc_errstr() either returns the desired error messages (linker errors in particular), or the build fails in the project source itself. In the latter case, the fuzz target is never touched by jcc, so error messages from the project source are not returned by mistake.
Thanks for documenting this. However, could you please try building the original fuzz target with JCC and see whether that fails?
To clarify, For testing, in
Then Below is a list of projects found not building in oss-fuzz-gen, and whether they build with jcc (7c67544) in oss-fuzz:
Force-pushed from 105bac4 to e9d5283
llm_toolkit/code_fixer.py (outdated)
        # Assume the default output name.
        return 'a.out'
      return os.path.basename(output_name)
    return ''
1. Exclude the simple cases first to avoid nested conditions, e.g.,
       if not target_found:
         return ''
2. Log a warning when the target is not found.
Please simplify this function, e.g.,
    for i, arg in enumerate(compile_args):
      if arg in ['-o', '--output'] and i < len(compile_args) - 1:
        output_name = compile_args[i + 1]
      elif arg.startswith('--output='):
        output_name = arg.removeprefix('--output=')
      elif not arg.startswith('-') and os.path.basename(arg) in target_names:
        target_found = os.path.basename(arg)
    if not target_found:
      return ''
    if output_name:
      return os.path.basename(output_name)
    if '-c' in compile_args:
      return f'{os.path.splitext(target_found)[0]}.o'
    logging.warning(
        'Output file not specified in [%s], but fuzz target found',
        ' '.join(compile_args))
    return 'a.out'
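For reference, the suggestion above can be wrapped into a standalone function and exercised on a few compile commands. This is only a sketch: the function name `get_output_name` and its signature are hypothetical (the thread does not show the real one), and `str.removeprefix` assumes Python 3.9 or later.

```python
import logging
import os


def get_output_name(compile_args, target_names):
    """Returns the output file name if the command compiles a fuzz target.

    Hypothetical wrapper around the reviewer's suggested loop; the name and
    signature are assumptions made for illustration.
    """
    output_name = ''
    target_found = ''
    for i, arg in enumerate(compile_args):
        if arg in ['-o', '--output'] and i < len(compile_args) - 1:
            output_name = compile_args[i + 1]
        elif arg.startswith('--output='):
            output_name = arg.removeprefix('--output=')
        elif not arg.startswith('-') and os.path.basename(arg) in target_names:
            target_found = os.path.basename(arg)

    # Simple case first: the command does not compile a fuzz target.
    if not target_found:
        return ''
    if output_name:
        return os.path.basename(output_name)
    # With -c and no explicit output, the compiler writes <source stem>.o.
    if '-c' in compile_args:
        return f'{os.path.splitext(target_found)[0]}.o'
    logging.warning(
        'Output file not specified in [%s], but fuzz target found',
        ' '.join(compile_args))
    return 'a.out'
```

For example, `get_output_name(['clang', '-c', '/src/fuzzer.c'], ['fuzzer.c'])` yields `'fuzzer.o'`, while a command that names no fuzz target yields an empty string.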
Thanks for the suggestion.
Re 2: It is probably better not to log a warning when the target is not found. Many commands are expected not to compile the fuzz target, which indicates that the log lines following them are not the target error lines we want.
I renamed the variable instead: target_found -> fuzz_target_found
Force-pushed from e9d5283 to 312a029
Thanks so much for the detailed review Dongge :)
    except FileNotFoundError as e:
      logging.error('Cannot get err.log for %s: %s', generated_project, e)
      # Touch err.log in results folder to avoid FileNotFoundError when
      # extracting errors.
      open(jcc_errlog_path, 'x')
Is there a better solution than this?
I guess this is intended to make parsing easier, but it may mislead us into thinking that JCC created an empty file. We would have to search the gcloud logs to distinguish these two cases.
Could we check os.path.isfile() at the beginning of extract_error_message() instead?
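A minimal sketch of the check proposed here, assuming `extract_error_message()` takes the path to err.log and returns a list of lines. The real signature and parsing logic are not shown in this thread, so both are placeholders.

```python
import os


def extract_error_message(err_log_path):
    """Returns the non-empty lines of err.log, or [] if the file is missing.

    The signature and return type are assumptions for illustration only.
    """
    # Bail out early instead of relying on a placeholder file existing.
    if not os.path.isfile(err_log_path):
        return []
    with open(err_log_path, encoding='utf-8', errors='replace') as log_file:
        return [line.rstrip('\n') for line in log_file if line.strip()]
```

With this guard, a missing err.log stays missing on disk, so it can be told apart from an empty file that JCC actually produced.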
Maybe add a magic string in err.log when err.log does not exist?
This solution is for local experiments; creating an empty file is currently consistent with the behaviour of getting the build log and run log. It might help prevent increasing the complexity of the build_and_run() workflow?
> Maybe add a magic string in err.log when err.log does not exist?
This can work, but it is not preferred.
Intuitively, if err.log is not generated, then it should not exist.
> creating an empty file is currently consistent with the behaviour of getting the build log and run log
Where do we create empty build log and run logs when they do not exist?
> Might help prevent increasing the complexity of build_and_run() workflow?
Not sure if this relates, but I suppose this will only add two lines in code_fixer?
> Could we check os.path.isfile() at the beginning extract_error_message() instead?
> Where do we create empty build log and run logs?
It's in the cloud builder: we open the local file object before checking whether the file blob exists in the cloud.
https://github.com/google/oss-fuzz-gen/blob/main/experiment/builder_runner.py#L645
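The effect described here can be illustrated with a small sketch. It is not the code in builder_runner.py: `download_log` and the `fetch_blob` callable are hypothetical stand-ins for the cloud storage lookup, used only to show why opening the local file first leaves an empty file behind when the blob is absent.

```python
def download_log(local_path, fetch_blob):
    """Writes a blob to local_path; fetch_blob() returns bytes or None.

    Both names are hypothetical; fetch_blob stands in for a cloud lookup.
    """
    # Opening in write mode creates (or truncates) the file immediately.
    with open(local_path, 'wb') as local_file:
        data = fetch_blob()
        if data is None:
            # The blob does not exist, but local_path now exists and is empty.
            return False
        local_file.write(data)
        return True
```

Under this ordering, a missing build or run log shows up locally as a zero-byte file, which is the behaviour the touched err.log mirrors.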
Ah, I see; let's keep it, then.
Thanks!
    elif arg.startswith('-o'):
      output_name = arg.removeprefix('-o')
    elif (not arg.startswith('-') and not arg == output_name and
          os.path.basename(arg) in target_names):
A silly question:
Why do we need not arg == output_name?
Also, would it be a good idea to log a warning if arg in ['-o', '--output'] but i + 1 >= len(compile_args)?
This is unlikely to happen now but may occur later when we automate the build script.
> Why do we need not arg == output_name?
Since we assigned output_name = compile_args[i + 1] in the previous iteration, we want to skip it now.
I'm thinking of the situation where we have clang src.c -o src.o, and we added src.o to target_names for the search.
Then it gets compiled again with clang++ src.cpp -o src.o. We don't want to assign src.o to fuzz_target_found.
That said, this should be OK either way because we will not use it later on.
> log a warning if i + 1 >= len(compile_args)
Sure, good point.
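The two points discussed here (not mistaking the value of -o for a fuzz target, and warning on a dangling -o) can be sketched as follows. Note the differences from the PR code: this sketch uses an explicit skip flag rather than the `not arg == output_name` comparison, and `find_output_and_target` is a hypothetical name chosen for illustration.

```python
import logging
import os


def find_output_and_target(compile_args, target_names):
    """Returns (output_name, fuzz_target_found) for one compile command."""
    output_name = ''
    fuzz_target_found = ''
    skip_next = False
    for i, arg in enumerate(compile_args):
        if skip_next:
            # This argument is the value of -o/--output, not a source file.
            skip_next = False
            continue
        if arg in ('-o', '--output'):
            if i + 1 >= len(compile_args):
                logging.warning('Missing value after %s in [%s]', arg,
                                ' '.join(compile_args))
            else:
                output_name = compile_args[i + 1]
                skip_next = True
        elif arg.startswith('--output='):
            output_name = arg.removeprefix('--output=')
        elif arg.startswith('-o'):
            # Attached form, e.g. -osrc.o.
            output_name = arg.removeprefix('-o')
        elif (not arg.startswith('-') and
              os.path.basename(arg) in target_names):
            fuzz_target_found = os.path.basename(arg)
    return output_name, fuzz_target_found
```

With `target_names = ['src.o']`, the command `clang++ src.cpp -o src.o` returns `('src.o', '')`, so the output file is not reported as a fuzz target.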
Force-pushed from 312a029 to 4d7b6e5
Thanks for addressing the comments, I have no more suggestions now.
Many thanks Dongge :)
/gcbrun request_pr_exp.py -n jim -f
nit: Could you please add the branch id to the name in the future?
Force-pushed from 4d7b6e5 to 8f4e841
Before merging:
This PR adds support for extracting error messages from jcc's err.log.