Commit 99da2de

add changes based on feedback
Signed-off-by: Jason Tsay <[email protected]>
1 parent b57bf7b commit 99da2de

6 files changed

+120
-29
lines changed

plugins/altk_json_processor/README.md

Lines changed: 25 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -5,15 +5,16 @@
55
66
Uses JSON Processor from ALTK to extract data from long JSON responses. See the [ALTK](https://altk.ai/) and the [JSON Processor component in the ALTK repo](https://github.com/AgentToolkit/agent-lifecycle-toolkit/tree/main/altk/post_tool/code_generation) for more details on how the component works.
77

8+
Note that this plugin calls an LLM and therefore requires configuring an LLM provider as described below. It also incurs some cost in time and money for those LLM calls. This cost can be limited via the length threshold in the configuration, so that the plugin only activates and calls an LLM on JSON responses above a particular length (default: 100,000 characters).
9+
810
## Hooks
911
- `tool_post_invoke` - Detects long JSON responses and processes as necessary
1012

1113
## Installation
1214

13-
1. Copy .env.example .env
14-
2. Enable plugins in `.env`
15-
3. Enable the "ALTKJsonProcessor" plugin in `plugins/config.yaml`.
16-
4. Install the optional dependency `altk` (i.e. `pip install mcp-context-forge[altk]`)
15+
1. Enable the "ALTKJsonProcessor" plugin in `plugins/config.yaml`.
16+
2. Install the optional dependency `altk` (i.e. `pip install mcp-context-forge[altk]`)
17+
3. Configure an LLM provider as described below.
1718

1819
## Configuration
1920

@@ -29,7 +30,7 @@ Uses JSON Processor from ALTK to extract data from long JSON responses. See the
2930
config:
3031
jsonprocessor_query: ""
3132
llm_provider: "watsonx" # one of watsonx, ollama, openai, anthropic
32-
watsonx:
33+
watsonx: # each section of providers is optional
3334
wx_api_key: "" # optional, can define WX_API_KEY instead
3435
wx_project_id: "" # optional, can define WX_PROJECT_ID instead
3536
wx_url: "https://us-south.ml.cloud.ibm.com"
@@ -45,3 +46,22 @@ Uses JSON Processor from ALTK to extract data from long JSON responses. See the
4546
4647
- `length_threshold` is the minimum number of characters before activating this component
4748
- `jsonprocessor_query` is a natural language statement of what the long response should be processed for. For an example of a long response for a musical artist: "get full metadata for all albums from the artist's discography in json format"
49+
50+
### LLM Provider Configuration
51+
52+
In the configuration, select an LLM provider via `llm_provider`. The current options are WatsonX, Ollama, OpenAI, and Anthropic.
53+
Then fill out the corresponding provider section in the plugin config. For many of the API key-related fields, an environment variable
54+
can also be used instead. If the field is set in both the plugin config and in an environment variable, the plugin config takes priority.
55+
56+
### JSON Processor Query
57+
58+
To guide the JSON Processor, provide a `jsonprocessor_query` (optional but recommended): a natural language statement of what the long response should be processed for.
59+
60+
Example queries:
61+
62+
- For an API endpoint such as [this Spotify artist overview](https://rapidapi.com/DataFanatic/api/spotify-scraper/playground/apiendpoint_fd33b4eb-d258-437e-af85-c244904acefc) that returns a large response, if you only want the discography of the artist, use a query such as: "get full metadata for all albums from the artist's discography in json format"
63+
- For a shopping API endpoint that returns a [response like this](https://raw.githubusercontent.com/AgentToolkit/agent-lifecycle-toolkit/refs/heads/main/examples/codegen_long_response_example.json), if you only want the sizes of the sneakers, use a query such as: "get the sizes for all products"
64+
65+
## Testing
66+
67+
Unit tests: `tests/unit/mcpgateway/plugins/plugins/altk_json_processor/test_json_processor.py`
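
The precedence rule stated in the README hunk above (plugin config first, environment variable as fallback, an error if neither is set) can be sketched as a small helper. This is only an illustration: `resolve_credential` and its parameters are hypothetical names and are not part of the plugin, which inlines this logic separately for each provider.

```python
import os


def resolve_credential(config_value: str, env_var: str, label: str) -> str:
    """Hypothetical helper: prefer the plugin config value, fall back to an env var."""
    # A non-empty value in the plugin config takes priority.
    if config_value:
        return config_value
    # Otherwise fall back to the environment; os.getenv returns None if unset.
    value = os.getenv(env_var)
    if not value:
        raise ValueError(f"{label} not found, provide {env_var} either in the plugin config or as an env var.")
    return value
```

Under these assumptions, a call such as `resolve_credential(cfg["openai"]["api_key"], "OPENAI_API_KEY", "OpenAI api key")` would cover the OpenAI branch, where `cfg` stands for the plugin's config dictionary.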

plugins/altk_json_processor/json_processor.py

Lines changed: 35 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -71,21 +71,24 @@ async def tool_post_invoke(self, payload: ToolPostInvokePayload, context: Plugin
7171
if len(self._cfg["watsonx"]["wx_api_key"]) > 0:
7272
api_key = self._cfg["watsonx"]["wx_api_key"]
7373
else:
74-
# Note that we assume here this env var exists and should throw an error if not
75-
api_key = os.environ["WX_API_KEY"]
74+
api_key = os.getenv("WX_API_KEY")
75+
if not api_key:
76+
raise ValueError("WatsonX api key not found, provide WX_API_KEY either in the plugin config or as an env var.")
7677
if len(self._cfg["watsonx"]["wx_project_id"]) > 0:
7778
project_id = self._cfg["watsonx"]["wx_project_id"]
7879
else:
79-
# Note that we assume here this env var exists and should throw an error if not
80-
project_id = os.environ["WX_PROJECT_ID"]
80+
project_id = os.getenv("WX_PROJECT_ID")
81+
if not project_id:
82+
raise ValueError("WatsonX project id not found, provide WX_PROJECT_ID either in the plugin config or as an env var.")
8183
llm_client = watsonx_client(model_id=self._cfg["model_id"], api_key=api_key, project_id=project_id, url=self._cfg["watsonx"]["wx_url"])
8284
elif provider == "openai":
8385
openai_client = get_llm("openai.sync")
8486
if len(self._cfg["openai"]["api_key"]) > 0:
8587
api_key = self._cfg["openai"]["api_key"]
8688
else:
87-
# Note that we assume here this env var exists and should throw an error if not
88-
api_key = os.environ["OPENAI_API_KEY"]
89+
api_key = os.getenv("OPENAI_API_KEY")
90+
if not api_key:
91+
raise ValueError("OpenAI api key not found, provide OPENAI_API_KEY either in the plugin config or as an env var.")
8992
llm_client = openai_client(api_key=api_key, model=self._cfg["model_id"])
9093
elif provider == "ollama":
9194
ollama_client = get_llm("litellm.ollama")
@@ -96,9 +99,13 @@ async def tool_post_invoke(self, payload: ToolPostInvokePayload, context: Plugin
9699
if len(self._cfg["anthropic"]["api_key"]) > 0:
97100
api_key = self._cfg["anthropic"]["api_key"]
98101
else:
99-
# Note that we assume here this env var exists and should throw an error if not
100-
api_key = os.environ["ANTHROPIC_API_KEY"]
102+
api_key = os.getenv("ANTHROPIC_API_KEY")
103+
if not api_key:
104+
raise ValueError("Anthropic api key not found, provide ANTHROPIC_API_KEY either in the plugin config or as an env var.")
101105
llm_client = anthropic_client(model_name=model_path, api_key=api_key)
106+
elif provider == "pytestmock":
107+
# only meant to be used for unit tests
108+
llm_client = None
102109
else:
103110
raise ValueError("Unknown provider given for 'llm_provider' in plugin config!")
104111

@@ -111,21 +118,28 @@ async def tool_post_invoke(self, payload: ToolPostInvokePayload, context: Plugin
111118
content = payload.result["content"][0]
112119
if "type" in content and content["type"] == "text":
113120
response_str = content["text"]
114-
try:
115-
response_json = json.loads(response_str)
116-
except json.decoder.JSONDecodeError:
117-
# ignore anything that's not json
118-
pass
119121

120-
if response_json and response_str and len(response_str) > self._cfg["length_threshold"]:
122+
if len(response_str) > self._cfg["length_threshold"]:
123+
try:
124+
response_json = json.loads(response_str)
125+
except json.decoder.JSONDecodeError:
126+
# ignore anything that's not json
127+
pass
128+
129+
# Should only get here if response is long enough and is valid JSON
130+
if response_json:
121131
logger.info("Long JSON response detected, using ALTK JSON Processor...")
122-
codegen = CodeGenerationComponent(config=config)
123-
nl_query = self._cfg["jsonprocessor_query"]
124-
input_data = CodeGenerationRunInput(messages=[], nl_query=nl_query, tool_response=response_json)
125-
output = codegen.process(input_data, AgentPhase.RUNTIME)
126-
output = cast(CodeGenerationRunOutput, output)
127-
payload.result["content"][0]["text"] = output.result
128-
logger.debug(f"ALTK processed response: {output.result}")
132+
if provider == "pytestmock":
133+
# only meant for unit testing
134+
payload.result["content"][0]["text"] = "(filtered response)"
135+
else:
136+
codegen = CodeGenerationComponent(config=config)
137+
nl_query = self._cfg.get("jsonprocessor_query", "")
138+
input_data = CodeGenerationRunInput(messages=[], nl_query=nl_query, tool_response=response_json)
139+
output = codegen.process(input_data, AgentPhase.RUNTIME)
140+
output = cast(CodeGenerationRunOutput, output)
141+
payload.result["content"][0]["text"] = output.result
142+
logger.debug(f"ALTK processed response: {output.result}")
129143
return ToolPostInvokeResult(continue_processing=True, modified_payload=payload)
130144

131145
return ToolPostInvokeResult(continue_processing=True)
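
The reordering in the last hunk above checks the response length before attempting to parse, so short responses never pay the cost of `json.loads`. A minimal standalone sketch of that gating follows. The name `parse_if_long_json` is hypothetical and does not appear in the plugin. Unlike the plugin, which tests the parsed value for truthiness, this sketch returns the parsed value and leaves that decision to the caller.

```python
import json
from typing import Any, Optional


def parse_if_long_json(text: str, length_threshold: int) -> Optional[Any]:
    """Hypothetical helper: parse `text` only if it is longer than the threshold."""
    # Cheap length check first, so short responses are never parsed.
    if len(text) <= length_threshold:
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Ignore anything that is not valid JSON.
        return None
```

A caller would then invoke the LLM-backed processor only when the return value is not `None` (and, to match the plugin's behavior, not empty).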

plugins/altk_json_processor/plugin-manifest.yaml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,5 +2,6 @@ description: "Uses JSON Processor from ALTK to extract data from long JSON respo
22
author: "Jason Tsay"
33
version: "0.1.0"
44
available_hooks:
5-
- "tool_post_hook"
5+
- "tool_post_invoke"
66
default_configs:
7+
length_threshold: 100000

plugins/config.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -887,7 +887,7 @@ plugins:
887887
config:
888888
jsonprocessor_query: ""
889889
llm_provider: "watsonx" # one of watsonx, ollama, openai, anthropic
890-
watsonx:
890+
watsonx: # each section of providers is optional
891891
wx_api_key: "" # optional, can define WX_API_KEY instead
892892
wx_project_id: "" # optional, can define WX_PROJECT_ID instead
893893
wx_url: "https://us-south.ml.cloud.ibm.com"

pyproject.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -116,7 +116,7 @@ dev = [
116116
"pydocstyle>=6.3.0",
117117
"pylint>=3.3.9",
118118
"pylint-pydantic>=0.3.5",
119-
#"pyre-check>=0.9.25", # incompatible with altk, superceded by pyrefly?
119+
#"pyre-check>=0.9.25", # unused, conflicts with altk, superseded by pyrefly
120120
"pyrefly>=0.35.0",
121121
"pyright>=1.1.406",
122122
"pyroma>=5.0",

tests/unit/mcpgateway/plugins/plugins/altk_json_processor/test_json_processor.py

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,56 @@
1+
# -*- coding: utf-8 -*-
2+
"""Location: ./tests/unit/mcpgateway/plugins/plugins/altk_json_processor/test_json_processor.py
3+
Copyright 2025
4+
SPDX-License-Identifier: Apache-2.0
5+
Authors: Jason Tsay
6+
7+
Tests for ALTKJsonProcessor.
8+
"""
9+
10+
# Standard
11+
import json
12+
13+
# Third-Party
14+
import pytest
15+
16+
# First-Party
17+
from mcpgateway.plugins.framework.models import (
18+
GlobalContext,
19+
HookType,
20+
PluginConfig,
21+
PluginContext,
22+
ToolPostInvokePayload,
23+
)
24+
25+
try:
26+
# ALTK may not be available due to being an optional dependency, skip if not available
27+
# First-Party
28+
from plugins.altk_json_processor.json_processor import ALTKJsonProcessor
29+
except ModuleNotFoundError:
30+
pytest.skip("altk not available", allow_module_level=True)
31+
32+
33+
@pytest.mark.asyncio
34+
async def test_threshold():
35+
plugin = ALTKJsonProcessor( # type: ignore
36+
PluginConfig(
37+
name="jsonprocessor", kind="plugins.altk_json_processor.json_processor.ALTKJsonProcessor", hooks=[HookType.TOOL_POST_INVOKE], config={"llm_provider": "pytestmock", "length_threshold": 50}
38+
)
39+
)
40+
ctx = PluginContext(global_context=GlobalContext(request_id="r1"))
41+
# below threshold, so the plugin should not activate
42+
too_short = {"a": "1", "b": "2"}
43+
too_short_payload = {"content": [{"type": "text", "text": json.dumps(too_short)}]}
44+
res = await plugin.tool_post_invoke(ToolPostInvokePayload(name="x1", result=too_short_payload), ctx)
45+
assert res.modified_payload is None
46+
long_enough = {
47+
"a": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.",
48+
"b": "Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
49+
"c": "Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.",
50+
"d": "Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.",
51+
}
52+
# above threshold, so the plugin should activate
53+
long_enough_payload = {"content": [{"type": "text", "text": json.dumps(long_enough)}]}
54+
res = await plugin.tool_post_invoke(ToolPostInvokePayload(name="x2", result=long_enough_payload), ctx)
55+
assert res.modified_payload is not None
56+
assert res.modified_payload.result["content"][0]["text"] == "(filtered response)"
