Score on Multi-turn conversation achieving the same score for different contexts #13

Mikeygoldman1 · 2023-11-26T13:41:52Z

I have been using the Hugging Face 'transformers' library and the score function below:

def score(cxt, hyp):
    model_input = tokenizer.encode(cxt + "<|endoftext|>" + hyp, return_tensors="pt")
    result = model(model_input, return_dict=True)
    return torch.sigmoid(result.logits)

However, for all the DialogRPT models, I get the same score for different multi-turn conversations. Two examples are below. Am I missing something, or are the models not meant to be used for evaluating multi-turn conversations?

Example 1:
cxt: You are a football expert. Ask questions related to football. Always start with an Introduction that asks the user if they would like to begin the interview.<|endoftext|>Hello! I'd love to interview you about your favourite music artist. Are you ready to begin?<|endoftext|>yes<|endoftext|>Great! Let's start with a basic question. Who is your favorite musician and why?<|endoftext|>Bob Marley. He is the king of reggae.<|endoftext|>I agree, he is amazing! What is your favourite song of his?<|endoftext|>Is this love.<|endoftext|>Wow, what a song! Can you recall when you last listened to this song?<|endoftext|>please end the interview, hyp: It was nice talking to you. Have a good day! , Score: tensor([[0.4118]], grad_fn=)

Example 2:
cxt: You are a football expert. Ask questions related to football. Always start with an Introduction that asks the user if they would like to begin the interview.<|endoftext|>Hello! I'm a football expert and I'd love to interview you about your knowledge of the sport. Are you ready to begin?<|endoftext|>yes <|endoftext|>Great! Let's start with a basic question. Who is your favorite football team?<|endoftext|>Manchester United <|endoftext|>Interesting choice! What do you think has been Manchester United's biggest achievement in the past decade?<|endoftext|>The 2012-2013 Premier League title was a significant achievement<|endoftext|>Definitely! Who do you think is the best player in Manchester United's current squad?<|endoftext|>Bruno Fernandes<|endoftext|>He's definitely been a standout player for Manchester United this season. What do you think makes him such a valuable asset to the team?<|endoftext|>please end the interview , hyp: Thank you for your time! It was great talking to you about football. Have a good day! , Score: tensor([[0.4118]], grad_fn=)
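
One thing that may be worth checking (an assumption, not something confirmed in this thread): GPT-2-style sequence-classification heads read the score from a single token position, and some transformers versions locate that position relative to the pad token. If the model's pad token id is the same as the id of <|endoftext|>, every turn separator looks like padding, and the score could be read at the end of the first turn. Both examples above share the same first turn, so a causal model would then see identical input at that position. The toy sketch below uses made-up integer token ids and hypothetical helper names, with no real tokenizer or model, to show the effect:

```python
# Toy illustration only: integer "token ids", no real tokenizer or model.
SEP = 0  # stands in for the id of <|endoftext|>; ASSUMED to also be the pad id


def index_before_first_pad(ids, pad_id):
    """Position a pad-based pooling rule would pick: the token just before
    the first pad token, or the last token when no pad token is present."""
    if pad_id in ids:
        return (ids.index(pad_id) - 1) % len(ids)
    return len(ids) - 1


def visible_prefix(ids, pad_id):
    """Tokens a causal (left-to-right) model can see at the pooled position."""
    return ids[: index_before_first_pad(ids, pad_id) + 1]


# Two conversations that share the first turn (the instruction), then diverge.
instruction = [11, 12, 13]
conv_a = instruction + [SEP, 21, 22, SEP, 23]
conv_b = instruction + [SEP, 31, 32, 33, SEP, 34, 35]

# With SEP treated as padding, both reduce to the shared first turn.
same_pooled_input = visible_prefix(conv_a, SEP) == visible_prefix(conv_b, SEP)
print(same_pooled_input)
```

If this is what is happening, comparing model.config.pad_token_id with tokenizer.eos_token_id should confirm or rule it out, and scoring two contexts whose first turns differ should produce different scores.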

GalDayan · 2023-11-27T13:02:24Z

👍

addypy · 2024-03-22T22:23:18Z

Not sure why but I had the same issue with torch>=2.0. Downgrading torch fixed this issue. I can verify it works with torch==1.13.1.

ZejiaYang · 2024-08-16T10:53:17Z

I still have the same issue with torch==1.13.1

ThundR67 · 2024-09-04T06:42:03Z

I am having this issue too, but I can't downgrade on the latest Python.
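
Since the reports above disagree on whether the torch version matters, it may help for anyone reproducing this to post the exact versions of both torch and transformers. A minimal sketch using only the standard library (the helper name installed_versions is hypothetical, and the package list is an assumption about what is relevant):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("torch", "transformers")):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


print(installed_versions())
```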
