Hello,

I am encountering an issue with the llama-3.1-405b model when using it through a Python script. Here is the script I am using:

import g4f, asyncio, sys
from g4f.client import Client
import subprocess

def gpt_free(QUERY):
    client = Client()
    response = client.chat.completions.create(
        model="llama-3.1-405b",
        messages=[{"role": "user", "content": QUERY}],
        web_search=False,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    arguments = sys.stdin.read().splitlines()
    QUERY = ' '.join(arguments)
    ANSWER = gpt_free(QUERY)
    print(ANSWER)

Execution Command:
my_script.py "my_question"

Issue:
The response returned by response.choices[0].message.content is truncated. The end of the reply is missing, particularly when I ask the model to generate a Python script.

Questions:
Is there a parameter to set a maximum size for the response to avoid truncation?
If the response is truncated, is there a way to access the rest of the response?
Are there known limitations with the llama-3.1-405b model regarding response length?

Additional Information:
I am requesting the model to generate Python scripts, which require relatively long responses.
I have tried adjusting the max_tokens parameter, but the issue persists.

Thank you for your assistance.
Best regards,
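A quick way to confirm on the client side whether the reply was actually cut short by the backend is to inspect the finish reason of the returned choice. This is a minimal sketch assuming the g4f client mirrors the OpenAI response schema; finish_reason is an assumption here and may be missing or None for some providers:

import sys
from g4f.client import Client

def gpt_free_checked(QUERY):
    client = Client()
    response = client.chat.completions.create(
        model="llama-3.1-405b",
        messages=[{"role": "user", "content": QUERY}],
        web_search=False,
    )
    choice = response.choices[0]
    # In the OpenAI schema, "length" means the backend hit its token limit;
    # not every g4f provider necessarily populates this attribute.
    reason = getattr(choice, "finish_reason", None)
    if reason not in (None, "stop"):
        print(f"warning: generation stopped early (finish_reason={reason})", file=sys.stderr)
    return choice.message.content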
Several providers impose internal maximum token limits. The G4F platform also supports a max_tokens parameter; however, only the HuggingFace limit is currently defined within G4F, which is set at 4000 tokens total, with a 2000-token limit for both input and generated text. Consequently, increasing the token limit within G4F is not possible. @fredmo
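Since the provider-side cap cannot be raised from G4F, one workaround is to feed the partial answer back into the conversation and ask the model to continue, then concatenate the pieces. The sketch below is a generic pattern, not a G4F feature; the stopping heuristic and the round limit are arbitrary choices:

from g4f.client import Client

def gpt_free_long(QUERY, max_rounds=3):
    client = Client()
    messages = [{"role": "user", "content": QUERY}]
    parts = []
    for _ in range(max_rounds):
        response = client.chat.completions.create(
            model="llama-3.1-405b",
            messages=messages,
            web_search=False,
        )
        part = response.choices[0].message.content
        parts.append(part)
        # Crude heuristic: assume the answer is complete if it ends like a sentence
        # or a closed code fence; use finish_reason instead if the provider reports it.
        if part.rstrip().endswith((".", "?", "!", "```")):
            break
        messages.append({"role": "assistant", "content": part})
        messages.append({"role": "user", "content": "Continue exactly where you stopped, without repeating anything."})
    return "".join(parts)

The stitched output may still need manual cleanup, since models often repeat the last sentence or reopen a code block when asked to continue.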
I have the feeling that a month ago the responses were not truncated.
I switched to a large max_tokens without seeing any improvement (yes, a small max_tokens will truncate more).
Isn't there a timer to configure somewhere to wait for the answer? Could it be a timeout that is too short?
Or is there another element in the "response" structure to check for more details?
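On that last point: even without knowing the exact response classes, plain Python introspection shows which fields the object actually carries, so the sketch below relies only on the standard library rather than on any undocumented g4f attribute:

from g4f.client import Client

client = Client()
response = client.chat.completions.create(
    model="llama-3.1-405b",
    messages=[{"role": "user", "content": "ping"}],
    web_search=False,
)
choice = response.choices[0]
# List the attributes of the response object and of the first choice.
print(type(response).__name__, vars(response) if hasattr(response, "__dict__") else dir(response))
print(type(choice).__name__, vars(choice) if hasattr(choice, "__dict__") else dir(choice))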
To facilitate troubleshooting, please enable debug logging on your server by adding the --debug argument. Additionally, review the response to identify the responding provider. @fredmo
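For a standalone script (rather than the API server started with --debug), the debug switch can also be flipped in process. The sketch below uses g4f's documented g4f.debug.logging flag; the response.provider attribute is an assumption, so it falls back to "unknown", and the debug log itself should name the provider that answered:

import g4f
from g4f.client import Client

g4f.debug.logging = True  # print provider selection and request details

client = Client()
response = client.chat.completions.create(
    model="llama-3.1-405b",
    messages=[{"role": "user", "content": "Write a short Python script."}],
    web_search=False,
)
print("provider:", getattr(response, "provider", "unknown"))  # assumed attribute
print(response.choices[0].message.content)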