Llama 3.3 70B: weird, gibberish outputs in production setup #3043
Comments
We are experiencing the same issue!
Any chance you could test TGI 3.1.1? We fixed two prefix-caching edge cases that can lead to long-term corruption.
@danieldk We are deploying it today; we will update this issue thread in a couple of weeks, after we validate that the issue has disappeared. Thank you very much for your work!! 😊
@danieldk turns out 3.1.1 has introduced some important changes in the Docker image that break the container in our setup:

```
/tgi-entrypoint.sh: line 5: ./.venv/bin/activate: No such file or directory
text-generation-launcher: error while loading shared libraries: libpython3.11.so.1.0: cannot open shared object file: No such file or directory
```

Our custom image simply contains the following:

```dockerfile
FROM huggingface/text-generation-inference:3.1.1

# Add a non-root llm user in its own group for isolation
ENV UID=1000
ENV USER=llm
RUN groupadd -g "${UID}" "${USER}" && useradd -m -u "${UID}" -g "${USER}" "${USER}"

# Switch to the non-root user and use their home as workdir
USER ${USER}
WORKDIR /home/${USER}
RUN mkdir -p -m 0744 /home/${USER}/cache
ENV HF_HOME=/home/${USER}/cache
```
We should probably also stop running as root by default, actually...
Okay, this is the fix:

Basically, the user you create needs to be able to traverse the
We reflected on making the Docker image non-root by default, but it's likely to break other things (especially around the mounted folders), so we're not going to go ahead with that yet.
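As a sketch of what such a permission fix could look like (the path `/usr/src` and the mode bits are assumptions on my part, not the maintainers' exact change): make the install tree world-readable and world-traversable before switching to the non-root user.

```dockerfile
FROM huggingface/text-generation-inference:3.1.1

# Add a non-root llm user in its own group for isolation
ENV UID=1000
ENV USER=llm
RUN groupadd -g "${UID}" "${USER}" && useradd -m -u "${UID}" -g "${USER}" "${USER}"

# Hypothetical workaround: let "other" users read files (r) and traverse
# directories (X) in the TGI install tree; /usr/src is an assumed location.
RUN chmod -R o+rX /usr/src

# Switch to the non-root user only after permissions are fixed
USER ${USER}
WORKDIR /home/${USER}
```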
@Narsil thanks for the fix/workaround; the container is now able to start. However, this warning/error still appears at the beginning of startup; should I be worried?
System Info
Runtime environment:
TGI ENV config:
All default values except the following (have tried with float16 and bfloat16, with the same outcome).

NVIDIA-SMI output:

text-generation-launcher --env output:

Model

/info output:

Model: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
Information
Tasks
Reproduction
We are running some ETLs that send requests to the TGI `/generate` endpoint with some text for the model to summarize (redacted for privacy purposes). These requests are often large in terms of the number of input tokens, and we run them from concurrent threads. We are observing that in some cases the model returns repeated, weird outputs, and in some cases it makes the model's subsequent responses weird as well, even for simple queries. Restarting the TGI container solves this temporarily, until the ETL runs again, which seems to "corrupt" it over time.

Some detailed info on the requests we are running:

temperature=0.1, max_tokens=500 (rest not set, i.e. default)

[37051,591,593,522,490,840,458,3700,4227,380,3144,4404,1949,3812,2606,1878,2132,1374,1241,397,1364,864,1323,782,956,722,686]
[475,329,322,366,319,346,395,416,506,290,537,483,531,533,511,456,499,481,398,298,350,367,313,365,345,379,345]

Example of the weird output:
was451 company of company and2 and and not was is and and of the was is isiHHHHH andH andi and andH\\'t and and and and and and and and isi andH and H2 HiH of the the\\'t not it HH H andH the the is isH H is not the is isH H andHHHH andH andH and and and \" H and H2 the not onlyH is and and\\'t\\'t not is isH and and and and and is and and and and and and and and and andH is was\\' it is a H and and and and and and and and not not is not is was is is is the is not is and and and and and and and and and and and and and and and and and and and and and is and and and and is is was is and and and and and had had and and and and and and\\'t\\' not is was is and is not not\\' is is was not is is is was is is is and is and and and and and and and not\\'t is is and and and not not need have have and and and and and need need and is is is is not is is is is not was is is and and and and is not is is and and and and and is is and and and and and and and need is is is and and and and it is not not not not is not is is is is is is is is a and and and and is is was is and and and and and is is and and and and and and and is is and and and is is is is is is is is is is and and and and and and and is a and and and and\\'t\\'t\\'t is is and and and and is is is is and is and is is not and and and and and is a is is is is is the is is and and and and and and is is is a and and and is is is a is a and and and is a is and is is a is a is not is a is not is is and and and and and and is and and and and and is not is not is and and and is for is a and is and and and and is is is and is a is is and and and and and is is and and and and and and and and is and and and is the it\\'t it is is is is of is is is is is is is and is and is and is is is is is a and and and and is is the is and is and is is is is is is is is for is is is a is a and is is is is a is is is a is a is and and and and is a and and and and and and and is not is a H and and is not is a is a is a is a is a and and and 
and and and and is is is is a is is is is and is is is is and H and and and is not is is a and is and is a is is is and is/ is and and and and is is is is is is and and and and is is is is and and and\\'t is is is is is is is is is is is is is a is a is and is and is and and is is and is and and is\\'t is is is is is is is a is creating is is is and is and is not is a is
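A minimal sketch of the kind of concurrent client that triggers this, assuming TGI's native `/generate` API (the URL, worker count, and documents are placeholders I introduce for illustration; note that the native API takes `max_new_tokens`, whereas `max_tokens` from the report suggests an OpenAI-compatible client):

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

# Assumption: a locally reachable TGI instance.
TGI_URL = "http://localhost:8080/generate"


def build_payload(text: str) -> dict:
    """Native TGI /generate payload with the parameters from the report."""
    return {
        "inputs": text,
        "parameters": {"temperature": 0.1, "max_new_tokens": 500},
    }


def summarize(text: str) -> str:
    req = Request(
        TGI_URL,
        data=json.dumps(build_payload(text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]


def run_batch(documents: list[str], workers: int = 8) -> list[str]:
    # Many long summarization requests issued from concurrent threads,
    # mirroring the ETL setup described above.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(summarize, documents))
```

Under this setup, corruption shows up as later responses degrading even for simple prompts until the container is restarted.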
Expected behavior
Model outputs are the same (correct, non-gibberish) in all load scenarios.
Other issues that may be related
I believe this issue could be related, and a similar scenario may apply: #2871
@Narsil @OlivierDehaene @drbh