Score on Multi-turn conversation achieving the same score for different contexts #13

Open
Mikeygoldman1 opened this issue Nov 26, 2023 · 4 comments


Mikeygoldman1 commented Nov 26, 2023

Hi @golsun,

I have been using the Hugging Face 'transformers' library and the score function below:

def score(cxt, hyp):
    model_input = tokenizer.encode(cxt + "<|endoftext|>" + hyp, return_tensors="pt")
    result = model(model_input, return_dict=True)
    return torch.sigmoid(result.logits)

However, with every DialogRPT model I get the same score for different multi-turn conversations; two examples are shown below. Am I missing something, or are the models not meant to be used for evaluating multi-turn conversations?

Example 1:
cxt: You are a football expert. Ask questions related to football. Always start with an Introduction that asks the user if they would like to begin the interview.<|endoftext|>Hello! I'd love to interview you about your favourite music artist. Are you ready to begin?<|endoftext|>yes<|endoftext|>Great! Let's start with a basic question. Who is your favorite musician and why?<|endoftext|>Bob Marley. He is the king of reggae.<|endoftext|>I agree, he is amazing! What is your favourite song of his?<|endoftext|>Is this love.<|endoftext|>Wow, what a song! Can you recall when you last listened to this song?<|endoftext|>please end the interview
hyp: It was nice talking to you. Have a good day!
Score: tensor([[0.4118]], grad_fn=)

Example 2:
cxt: You are a football expert. Ask questions related to football. Always start with an Introduction that asks the user if they would like to begin the interview.<|endoftext|>Hello! I'm a football expert and I'd love to interview you about your knowledge of the sport. Are you ready to begin?<|endoftext|>yes <|endoftext|>Great! Let's start with a basic question. Who is your favorite football team?<|endoftext|>Manchester United <|endoftext|>Interesting choice! What do you think has been Manchester United's biggest achievement in the past decade?<|endoftext|>The 2012-2013 Premier League title was a significant achievement<|endoftext|>Definitely! Who do you think is the best player in Manchester United's current squad?<|endoftext|>Bruno Fernandes<|endoftext|>He's definitely been a standout player for Manchester United this season. What do you think makes him such a valuable asset to the team?<|endoftext|>please end the interview
hyp: Thank you for your time! It was great talking to you about football. Have a good day!
Score: tensor([[0.4118]], grad_fn=)
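
One thing that can be checked without loading a model: the two inputs are not identical, but they do begin with the same first turn. The sketch below builds the input string the same way score() concatenates it and counts how many leading turns two inputs share. The helper names `build_input` and `shared_leading_turns` are hypothetical, and the turns are shortened stand-ins for the examples above. Whether the model's output actually depends only on that shared prefix is an assumption that nothing in this thread confirms; the sketch only helps rule out an input-construction mistake.

```python
SEP = "<|endoftext|>"  # separator used in the score() function above


def build_input(turns, hyp):
    """Join context turns and the hypothesis the way score() concatenates them."""
    return SEP.join(turns) + SEP + hyp


def shared_leading_turns(text_a, text_b):
    """Count how many leading SEP-delimited turns two inputs have in common."""
    count = 0
    for a, b in zip(text_a.split(SEP), text_b.split(SEP)):
        if a != b:
            break
        count += 1
    return count


# Shortened, hypothetical stand-ins for the two examples above.
first = "You are a football expert."
ex1 = build_input([first, "Hello! ... music artist ...", "yes"],
                  "It was nice talking to you.")
ex2 = build_input([first, "Hello! ... football ...", "yes "],
                  "Thank you for your time!")

print(ex1 == ex2)                      # False: the full inputs differ
print(shared_leading_turns(ex1, ex2))  # 1: only the first turn is shared
```

If the full strings differ but the scores still match to four decimals, the next thing to compare would be the raw `result.logits` values before the sigmoid.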

@GalDayan

👍


addypy commented Mar 22, 2024

Not sure why, but I had the same issue with torch>=2.0. Downgrading torch fixed it; I can verify that it works with torch==1.13.1.
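
Since the behaviour seems to vary between environments, it may help to post exact package versions alongside the scores. A minimal sketch using only the standard library follows. The helper name `installed_versions` is hypothetical, and checking `transformers` as well as `torch` is an assumption based on the library mentioned in the original post.

```python
from importlib import metadata


def installed_versions(packages=("torch", "transformers")):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print(installed_versions())
```

This reads installed package metadata rather than importing the packages, so it also works in an environment where importing torch fails.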

@ZejiaYang

I still have the same issue with torch==1.13.1


ThundR67 commented Sep 4, 2024

I am having this issue too, but I can't downgrade torch on the latest Python.
