Question about clip-level retrieval

Your work is very impressive!

I want to clarify: when evaluating clip-level retrieval, is the candidate pool for each query every clip in the test set, or only the clips from the same video as the ground-truth clip?
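
To make the two options concrete, here is a minimal sketch of what I mean. Everything in it is hypothetical and not taken from your code: the function name `recall_at_1`, the `within_video` flag, and the toy similarity matrix are my own, and I am assuming that the ground truth for query i is clip i.

```python
import numpy as np

def recall_at_1(sim, video_ids, within_video):
    """Recall@1 for a square similarity matrix (hypothetical helper).

    sim[i, j] is the similarity between query i and clip j;
    the ground-truth clip for query i is assumed to be clip i.
    video_ids[j] is the id of the video that clip j was cut from.
    """
    sim = np.asarray(sim, dtype=float).copy()
    video_ids = np.asarray(video_ids)
    if within_video:
        # Option B: only clips from the ground-truth clip's video are candidates.
        same_video = video_ids[:, None] == video_ids[None, :]
        sim[~same_video] = -np.inf
    # Option A (within_video=False): every clip in the test set is a candidate.
    predictions = sim.argmax(axis=1)
    return float((predictions == np.arange(len(sim))).mean())

# Toy data: 4 clips, two per video. Query 0 is most similar to a clip
# from the other video, so the two protocols give different scores.
toy_sim = [
    [0.90, 0.10, 0.95, 0.00],
    [0.20, 0.80, 0.10, 0.10],
    [0.10, 0.00, 0.70, 0.30],
    [0.00, 0.10, 0.20, 0.60],
]
toy_video_ids = [0, 0, 1, 1]

global_r1 = recall_at_1(toy_sim, toy_video_ids, within_video=False)
within_r1 = recall_at_1(toy_sim, toy_video_ids, within_video=True)
```

On this toy data the global pool gives a lower Recall@1 than the within-video pool, so knowing which protocol the reported numbers use matters for a fair comparison.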