MedHELM V1 #3403
…ine, change bertscore backbone model to fit on 40GB GPU
GITHUB_DIR_URL = "https://github.com/raulista1997/benchmarkdata/tree/main/mtsamples_processed"
RAW_BASE_URL = "https://raw.githubusercontent.com/raulista1997/benchmarkdata/refs/heads/main/mtsamples_processed/"
Pin the git hash.
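One way to pin, sketched under assumptions: swap the branch reference in the raw URL for a full commit SHA, so the downloaded files cannot change after the scenario is merged. `PINNED_COMMIT` below is a placeholder rather than a real commit of that repository, and `build_raw_url` is a hypothetical helper, not something from the PR.

```python
# Placeholder only: replace with the full 40-character SHA of the commit to pin.
PINNED_COMMIT = "0" * 40

REPO = "raulista1997/benchmarkdata"
DATA_DIR = "mtsamples_processed"


def build_raw_url(filename: str, commit: str = PINNED_COMMIT) -> str:
    """Return a raw.githubusercontent.com URL tied to one specific commit."""
    return f"https://raw.githubusercontent.com/{REPO}/{commit}/{DATA_DIR}/{filename}"
```

Because the commit is part of the URL, later pushes to `main` do not affect what the scenario downloads.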
soup = BeautifulSoup(response.text, "html.parser")
file_links = [
    link.text
    for link in soup.find_all(
        "a", {"href": re.compile(r"/raulista1997/benchmarkdata/blob/main/mtsamples_processed/.*\.txt$")}
    )
]
return file_links
Use the GitHub API and pin the git hash (same comments as for mtsamples_procedures_scenario).
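A rough sketch of what this could look like, as an alternative to scraping the HTML page: the GitHub REST "repository contents" endpoint returns a JSON directory listing and accepts a `ref` query parameter, which can be a commit SHA. The helper names and the sample payload below are illustrative assumptions, and the network call itself is left out so the example stays self-contained.

```python
import json
from typing import Any, Dict, List
from urllib.parse import quote

API_ROOT = "https://api.github.com"


def contents_api_url(owner: str, repo: str, path: str, ref: str) -> str:
    """Build the repository-contents URL for a directory at a fixed ref."""
    return f"{API_ROOT}/repos/{owner}/{repo}/contents/{quote(path)}?ref={quote(ref)}"


def txt_file_names(listing: List[Dict[str, Any]]) -> List[str]:
    """Keep the names of regular .txt files from a directory listing."""
    return [
        entry["name"]
        for entry in listing
        if entry.get("type") == "file" and entry["name"].endswith(".txt")
    ]


# Hand-written payload shaped like the API response (not real data).
sample_listing = json.loads(
    '[{"name": "note_1.txt", "type": "file"}, {"name": "docs", "type": "dir"}]'
)
file_names = txt_file_names(sample_listing)
```

In a real scenario the listing would come from fetching `contents_api_url(...)` with an HTTP client and decoding the JSON body. Very large directories may exceed what this endpoint returns in one listing, so that limit is worth checking before relying on it.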
@@ -1958,6 +1976,15 @@ models:
    num_parameters: 14000000000
    release_date: 2024-05-21
    tags: [TEXT_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]

  - name: microsoft/phi-3.5-mini-instruct
microsoft/phi-3.5-mini-instruct already exists; remove.
src/helm/config/model_metadata.yaml
Outdated
    release_date: 2024-09-25
    tags: [TEXT_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]

  - name: meta/llama-3.1-8b-instruct
meta/llama-3.1-8b-instruct already exists; delete.
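Duplicate entries like these can be caught mechanically. The sketch below is not part of the PR: it scans YAML text for repeated `- name:` lines with a regular expression, assuming each model entry starts with such a line. `duplicate_model_names` and the sample text are hypothetical.

```python
import re
from collections import Counter
from typing import List

# Matches list items of the form "- name: some/model-id" on a single line.
NAME_LINE = re.compile(r"^[ \t]*-[ \t]*name:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def duplicate_model_names(yaml_text: str) -> List[str]:
    """Return every model name that appears more than once, sorted."""
    counts = Counter(NAME_LINE.findall(yaml_text))
    return sorted(name for name, count in counts.items() if count > 1)


# Illustrative input only; not the real contents of model_metadata.yaml.
sample_yaml = """models:
  - name: meta/llama-3.1-8b-instruct
    release_date: 2024-07-23
  - name: microsoft/phi-3.5-mini-instruct
  - name: meta/llama-3.1-8b-instruct
"""
duplicates = duplicate_model_names(sample_yaml)
```

A check like this could run over src/helm/config/model_metadata.yaml before a PR is submitted. A real YAML parser would be more robust than the regular expression used here.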
src/helm/config/model_metadata.yaml
Outdated
@@ -1530,6 +1530,24 @@ models:
    release_date: 2022-12-22
    tags: [] # TODO: add tags

  - name: meta/llama-3.2-1b-instruct
Move this to right before the entry for meta/llama-3.2-3b-instruct-turbo.
get_instructions,
extract_patient_id_from_fname,
get_ehrs,
get_tokenizer,
tag_rgx_expression,
fetch_nodes_with_tag,
cast_dtype,
check_condition,
check_all_conditions,
remove_node,
query_xml_str,
filter_events,
retrieve_most_relevant_visits,
get_prompt_template,
pack_and_trim_prompts,
preprocess_prompts,
add_reference_responses,
return_dataset_dataframe,
I think you only need to import return_dataset_dataframe.
# get the patient EHR selected for this instruction
pt_id: Union[str, int] = instruction_dict["patient_id"]
relevant_ehr = ehrs[pt_id]  # type: ignore
prompt = PassageQuestionInput(passage="", question=question)
Ping.
@staticmethod
def get_date_of_note(patient: Dict[str, Any], note_idx: int) -> str:
    """Get date of note for patient"""
    if not isinstance(note_idx, int):
eval() is insecure code. Please either use int() instead, or remove this if block.
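A minimal sketch of the int() variant follows. The body of the original function is not shown in this review, so the `patient["notes"][...]["date"]` layout is an assumption made only so the example runs.

```python
from typing import Any, Dict, Union


def get_date_of_note(patient: Dict[str, Any], note_idx: Union[int, str]) -> str:
    """Get date of note for patient (sketch; the patient layout is assumed)."""
    if not isinstance(note_idx, int):
        # int() only parses integer literals and raises ValueError otherwise,
        # whereas eval() would execute whatever expression the string contains.
        note_idx = int(note_idx)
    return patient["notes"][note_idx]["date"]


# Hypothetical patient record, for illustration only.
example_patient = {"notes": [{"date": "2020-01-01"}, {"date": "2020-02-02"}]}
example_date = get_date_of_note(example_patient, "1")
```

With int(), a malicious or malformed index string fails with ValueError instead of being executed.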
cursor.execute(ground_truth_sql)
fetched_result = cursor.fetchone()
if fetched_result:
    # Convert extra_values to match SQLite's expected types
    converted_values = [
        type(fetched_result[i])(extra_values[i]) for i in range(len(extra_values))
    ]
    ground_truth_result = converted_values
Does this actually work? If we're in this block, then it means that cursor.fetchall() returned a false-y value or that the query failed, so re-running the query should also result in failure. I'm fine with just using extra_values as is (i.e. the original version).
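The concern can be illustrated with a small sqlite3 experiment. The table, query, and values below are invented for the demonstration and are not taken from the scenario code. A query that yields no rows the first time yields none the second time either, so the type-conversion branch never runs and the fallback ends up being extra_values unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
# Hypothetical schema; the table stays empty so the query returns no rows.
cursor.execute("CREATE TABLE visits (patient_id INTEGER, cost REAL)")

ground_truth_sql = "SELECT cost FROM visits WHERE patient_id = 1"

cursor.execute(ground_truth_sql)
first_attempt = cursor.fetchall()  # empty list, which is false-y

cursor.execute(ground_truth_sql)
fetched_result = cursor.fetchone()  # None, so the conversion branch is skipped

extra_values = ["42"]  # made-up fallback values
if fetched_result:
    ground_truth_result = [
        type(fetched_result[i])(extra_values[i]) for i in range(len(extra_values))
    ]
else:
    ground_truth_result = extra_values

conn.close()
```

This only covers the "no rows" case. If the first query raised an exception instead, re-running the same SQL would normally raise again.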
LGTM. Thank you all!
In this PR, we add the 31 scenarios that are part of the first release of MedHELM, along with the model deployments used to run all benchmarks. Changes checklist:
src/helm/benchmark/scenarios
medhelm_run_specs.py