Open
Description
When calling repo.walk in a new thread, it blocks the main thread until it has finished.
See the example code below: when run against a huge repo (e.g. qt5 or chromium), the main thread won't print any message until repo.walk has ended. (Iterating with a for loop instead of calling list behaves the same.)
It seems that repo.diff also has this problem.
from pygit2 import Repository, GIT_SORT_TOPOLOGICAL
from threading import Thread
import sys
import time


def thread_func(repo_dir):
    repo = Repository(repo_dir)
    print(">>>>>>>> begin diff")
    commits = list(repo.walk(repo.head.target, GIT_SORT_TOPOLOGICAL))
    #for commit in repo.walk(repo.head.target, GIT_SORT_TOPOLOGICAL):
    #    continue
    print(">>>>>>>> end diff")


def test(repo_dir):
    t = Thread(target=thread_func, args=[repo_dir])
    t.start()
    while t.is_alive():
        print("main thread...")
        time.sleep(0.01)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print(">>>>>>>> Invalid argument")
        sys.exit(-1)
    test(sys.argv[1])
Activity
jdavid commented on May 3, 2020
That's not the behaviour I observe. If you replace list(...) with the for loop, you will see many prints. In other words, it's list which is blocking, not pygit2. And that's expected in my opinion; read about Python's GIL (Global Interpreter Lock): list is a single call, so the GIL won't allow any other thread to run.
You can either write the code differently, using a for loop, or go multiprocessing.
timxx commented on May 5, 2020
As I mentioned, the for loop behaves the same here. My project also uses a for loop, but it still hangs the GUI thread. On Windows it is even worse than on Linux.
I will try multiprocessing to see whether it performs well enough for walk on a small repo.
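For reference, here is a minimal sketch of the "separate process" idea discussed above. It is not taken from pygit2 or from either commenter: it uses the standard subprocess module rather than multiprocessing (which sidesteps pickling and start-method differences between Windows and Linux), and the names WALK_SCRIPT, run_in_child and count_commits are hypothetical helpers made up for illustration. The walk runs in a child interpreter, so its GIL is not shared with the parent, and the parent keeps ticking while it waits.

```python
import subprocess
import sys

# Hypothetical child script: walks the history and prints the commit count.
# It only uses the pygit2 calls already shown in the issue.
WALK_SCRIPT = """
import sys
from pygit2 import Repository, GIT_SORT_TOPOLOGICAL
repo = Repository(sys.argv[1])
print(sum(1 for _ in repo.walk(repo.head.target, GIT_SORT_TOPOLOGICAL)))
"""


def run_in_child(script, *args, poll_interval=0.01, on_tick=None):
    """Run `script` in a separate Python process and return its stdout.

    While the child is running, `on_tick` (if given) is called roughly
    every `poll_interval` seconds in the parent process.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", script, *args],
        stdout=subprocess.PIPE,
        text=True,
    )
    while True:
        try:
            # communicate() may be retried after a timeout; no output is lost.
            out, _ = proc.communicate(timeout=poll_interval)
        except subprocess.TimeoutExpired:
            if on_tick is not None:
                on_tick()
            continue
        if proc.returncode != 0:
            raise RuntimeError(f"child exited with status {proc.returncode}")
        return out


def count_commits(repo_dir):
    """Count commits reachable from HEAD without blocking the caller's loop."""
    out = run_in_child(
        WALK_SCRIPT, repo_dir, on_tick=lambda: print("main thread...")
    )
    return int(out)
```

Calling count_commits("/path/to/repo") should keep printing "main thread..." while the child walks the history. In a GUI application the on_tick callback would be replaced by whatever the toolkit offers for polling (a timer, for example). The trade-off is that commit objects cannot cross the process boundary directly, so the child has to print or serialise whatever the parent needs.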