Report Generation Agent: Adding Trajectory Evaluation #27
Merged
commit b4e124d Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Thu Jan 29 17:00:58 2026 -0500 Small fixes, additional logging and updated groud truth
commit 7d59004 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Wed Jan 28 16:30:23 2026 -0500 Upgrading python-multipart + small improvements
commit 2906b36 Merge: 285591b bba7326 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Wed Jan 28 16:12:03 2026 -0500 Merge branch 'main' into marcelo/langfuse-integration
commit 285591b Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Wed Jan 28 16:09:16 2026 -0500 Adding readme instructions
commit 37348c0 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Wed Jan 28 15:53:47 2026 -0500 Minor improvements
commit 9fdc71d Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Wed Jan 28 15:46:25 2026 -0500 Addingh evaluator and retry mechanism
commit 5af7152 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Wed Jan 28 14:41:36 2026 -0500 Using langfuse to upload a dataset and run the evaluation
commit c1980fe Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Wed Jan 28 12:39:33 2026 -0500 Adding the eval dataset and making changes to the eval script. Adding tenacity for retrying mechanism
commit 02c3ac5 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Tue Jan 27 16:57:37 2026 -0500 Added code comments
commit da9b0c9 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Tue Jan 27 16:50:06 2026 -0500 Finished using LLMs to evaluate result
commit f0af403 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Tue Jan 27 13:51:06 2026 -0500 Moving forward with the evaluation script + some more refactorings
commit 93ee157 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Tue Jan 27 11:36:52 2026 -0500 Reporting to langfuse and removed clutter
commit d029285 Merge: a39ac1d 9549395 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Tue Jan 27 11:00:28 2026 -0500 Merge branch 'main' into marcelo/langfuse-integration
commit a39ac1d Merge: cdf0647 efd80cb Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Mon Jan 26 17:09:09 2026 -0500 Merge branch 'marcelo/report-agent' into marcelo/langfuse-integration
commit efd80cb Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Mon Jan 26 17:03:37 2026 -0500 CR by Franklin
commit 7a2a57f Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Mon Jan 26 16:31:49 2026 -0500 CR by Franklin
commit cdf0647 Merge: 53d0589 534f8e5 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Mon Jan 26 16:25:19 2026 -0500 Merge branch 'marcelo/report-agent' into marcelo/langfuse-integration
commit 534f8e5 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Mon Jan 26 16:19:30 2026 -0500 CR by Franklin
commit 53d0589 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Mon Jan 26 16:07:33 2026 -0500 Some more langfuse things
commit 40dfc6f Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Mon Jan 26 13:42:41 2026 -0500 Parsing client responses into langfuse traces
commit 20e4ec5 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Mon Jan 26 11:42:38 2026 -0500 Small refactor
commit ee8b854 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Mon Jan 26 11:36:14 2026 -0500 Moving env and logging config to the top of the file
commit 66a4494 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Mon Jan 26 11:13:05 2026 -0500 CR by Amrit
commit f9d7862 Merge: dc02ff2 9042ace Author: Marcelo Lotif <lotif@users.noreply.github.com> Date: Mon Jan 26 11:12:42 2026 -0500 Merge branch 'main' into marcelo/report-agent
commit dc02ff2 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Fri Jan 23 12:56:56 2026 -0500 Grammar fixes
commit 530360e Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Fri Jan 23 12:42:50 2026 -0500 Adding a couple more vulnerabilities to the skip list
commit 7bb081f Merge: 6e3c4c2 bd34ef0 Author: Marcelo Lotif <lotif@users.noreply.github.com> Date: Fri Jan 23 12:37:19 2026 -0500 Merge branch 'main' into marcelo/report-agent
commit 6e3c4c2 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Fri Jan 23 12:35:08 2026 -0500 One more readme paragraph
commit 37b4000 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Fri Jan 23 12:27:23 2026 -0500 Movign files around, adding the ddl file and the import script
commit 3458565 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Thu Jan 22 14:39:47 2026 -0500 Generating xlsx reports
commit 22fc569 Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Thu Jan 22 12:28:40 2026 -0500 Adding more report examples
commit 6592a1c Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Thu Jan 22 11:55:41 2026 -0500 Deleting weaviate stuff, using Online Retail dataset instead
commit 0098f7d Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Thu Jan 22 11:37:51 2026 -0500 Weaviate local and remote scripts
commit 9e6ce2e Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Wed Jan 21 11:47:00 2026 -0500 Adding data import for the online retail dataset and some more instructions
commit a77a60f Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Fri Jan 16 17:56:15 2026 -0500 WIp trying to make it work
commit 4507d52 Merge: b4e124d 412298a Author: Marcelo Lotif <marcelo.lotif@vectorinstitute.ai> Date: Fri Jan 30 16:58:09 2026 -0500 Merge branch 'main' into marcelo/langfuse-integration
commit 412298a Author: Amrit Krishnan <amrit110@gmail.com> Date: Fri Jan 30 13:29:20 2026 -0500 Feature/knowledge agent (#18)
* Add initial working implementation using search grounding
* [pre-commit.ci] Add auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Remove example implementation
* Fix GHSA-wp53-j4wj-2cfg, pin python-multipart version
* Update agent to ReAct, fix grounding tool
* Update README.md
* Add tracing to langfuse
* Clear notebook cells
* Remove python-multipart as direct dependency and only update it
* Remove D103 and E402 from being ignored in pre-commit check and fix notebooks
* Move imports to top of the file
* Simplify tracing module to just read directly from env variables
* Rename async client manager for agent, reuse existing async client manager for tracing
* Clarify optional dataset variable in docstring
* Fix format_response_with_citations
* Return results instead of modifying input params
* Use pydantic native desc docstring instead of numpy style
* Unify config to use same across agents
* Use ADK's session management, remove custom implementation
* Remove weaviate from client manager
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
amrit110 (Member) approved these changes on Feb 2, 2026 and left a comment:
Looks good to me. Haven't tried it but the code looks fine.
Summary
Adding an agent for trajectory evaluation + some minor refactors.
For an example of what the evals look like, please see:
https://us.cloud.langfuse.com/project/cmkwsswke005dad07gxujnipq/datasets/cml5fp1y305luad06kocwn33q/runs/4a3bd97b-99ed-41ee-ad9e-01ec6d3f179e
Clickup Ticket(s): NA
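As a rough illustration of what "trajectory evaluation" means in the summary above, here is a minimal sketch that scores an agent run by checking whether the expected tool calls appear, in order, in the calls the agent actually made. This is a deterministic stand-in for illustration only; the PR itself adds an LLM-based evaluator agent, and the `Step` class, the `trajectory_score` function, and the tool names below are hypothetical and not taken from the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One step of an agent trajectory (hypothetical schema: just a tool name)."""

    tool: str


def trajectory_score(actual: list[Step], expected: list[Step]) -> float:
    """Score a trajectory against a ground-truth trajectory.

    Returns the length of the longest prefix of `expected` that occurs as an
    in-order subsequence of `actual`, divided by len(expected). Extra or
    repeated steps in `actual` are ignored.
    """
    if not expected:
        return 1.0
    matched = 0
    for step in actual:
        if matched < len(expected) and step.tool == expected[matched].tool:
            matched += 1
    return matched / len(expected)


# Hypothetical tool names for a report-generation run.
expected = [Step("get_schema"), Step("run_sql"), Step("write_xlsx")]
actual = [Step("get_schema"), Step("run_sql"), Step("run_sql"), Step("write_xlsx")]
score = trajectory_score(actual, expected)
```

A rule like this only checks the order of calls; an LLM judge, as used in this PR, can also weigh whether the arguments and intermediate results were reasonable.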
Type of Change
Changes Made
prompts.py
evaluate.py script
Testing
uv run pytest tests/
uv run mypy <src_dir>
uv run ruff check src_dir/
Manual testing details:
Executed the import data and evaluate scripts
Checklist