A question in class GSS in core_chime6.py #21

Open
elissopp opened this issue Mar 9, 2022 · 6 comments


@elissopp

elissopp commented Mar 9, 2022

In line 198 of core_chime6.py, the method predict is called with a parameter source_activity_mask:

affiliation = cur.predict(
                   Obs.T[f, ...],
                   source_activity_mask=source_active_mask[f, ..., :T]
               )

But in the definition of the object cur (as well as in class CACGMM in pb_bss/distribution/cacgmm.py), the method predict doesn't have this parameter. Simply changing predict to _predict doesn't help.

I would appreciate an answer whenever you have time.

@boeddeker
Member

Thank you for reporting this.

My local version of pb_bss has some changes that are not published.
I forgot to publish them because, when testing the code, I always used iterations_post == 1, so that code path was never executed in the tests.

The difference between predict and _predict is that the first is convenient to use and has the same interface across all distributions.
The second is for internal use: it skips the overhead computations (e.g. normalizing and transposing the input) and returns distribution-specific states.
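
The split between a public predict and an internal _predict can be sketched with a toy class. This is only an illustration of the pattern, not the pb_bss implementation: the class `ToyModel`, its `directions` attribute, the similarity rule, and the input layouts (channels by frames for `predict`, frames by channels for `_predict`) are all assumptions made for the example.

```python
import math


class ToyModel:
    """Hypothetical model with one real-valued direction vector per source."""

    def __init__(self, directions):
        self.directions = directions  # K vectors of length D

    def _predict(self, frames):
        """Internal step: expects unit-norm frames in (T, D) layout.

        Returns the affiliation (K x T) plus a model-specific extra,
        here the raw similarity scores.
        """
        scores = [
            [sum(d * v for d, v in zip(direction, frame)) ** 2 for frame in frames]
            for direction in self.directions
        ]
        num_frames = len(frames)
        totals = [max(sum(row[t] for row in scores), 1e-12) for t in range(num_frames)]
        affiliation = [
            [row[t] / totals[t] for t in range(num_frames)] for row in scores
        ]
        return affiliation, scores

    def predict(self, observation):
        """Public wrapper: accepts raw data in (D, T) layout."""
        frames = [list(frame) for frame in zip(*observation)]  # transpose to (T, D)
        normalized = []
        for frame in frames:
            norm = math.sqrt(sum(v * v for v in frame))
            normalized.append([v / max(norm, 1e-12) for v in frame])
        affiliation, _ = self._predict(normalized)  # drop the internal extra
        return affiliation


model = ToyModel([[1.0, 0.0], [0.0, 1.0]])
observation = [[3.0, 0.0, 1.0], [0.0, 5.0, 1.0]]  # D = 2 channels, T = 3 frames
affiliation = model.predict(observation)
```

Calling `_predict` directly with the raw `observation` would skip the transpose and the normalization, which is one reason a plain rename from predict to _predict is not a drop-in replacement.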

I will fix this in fgnt/pb_bss#34.
In the meantime, you could change iterations_post from 0 to 1 (i.e. remove the source activity constraint in the final a posteriori / affiliation estimation). I forgot how the performance changed with this parameter, but at least it had no negative effect.

@elissopp
Author

elissopp commented Mar 9, 2022

Thanks a lot for your rapid reply and clear answer. I will make the change you suggested.

Besides, I have a small question about the GSS code. As I understand from your paper, GSS avoids the permutation problem by using oracle time annotations, while a standalone cACGMM needs an extra permutation alignment. But comparing the classes GSS and CACGMM, GSS directly uses the method predict without time annotations. So is the permutation problem solved in the fit procedure through the initialization? In fact, I'm not very familiar with the code of mixture models QAQ.

Thank you again for answering my question

@elissopp
Author

elissopp commented Mar 9, 2022

I mean calling the method predict without a source_activity_mask, like

affiliation = cur.predict(
                   Obs.T[f, ...]
               )

In this case, does the permutation problem still exist?

@boeddeker
Member

There are two ideas that most likely produce a permutation-free solution:

  • The a posteriori probability / affiliation is initialized with the activity of the sources.
  • Several iterations are run in which the a posteriori probability / affiliation is constrained to the activity of the sources.
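
Both ideas can be illustrated on plain nested lists. This is a minimal sketch, not the code from pb_bss or core_chime6.py: the helper names `init_from_activity` and `constrain_affiliation` are hypothetical, and the layout (sources by frames, with an always-active noise class as the last row) is an assumption.

```python
def normalize_over_sources(values):
    """Scale each frame so the entries sum to one across sources."""
    num_sources = len(values)
    num_frames = len(values[0])
    result = [[0.0] * num_frames for _ in range(num_sources)]
    for t in range(num_frames):
        total = sum(values[k][t] for k in range(num_sources))
        for k in range(num_sources):
            # Fall back to a uniform split if nothing is active in a frame.
            result[k][t] = values[k][t] / total if total > 0 else 1.0 / num_sources
    return result


def init_from_activity(activity):
    """Idea 1: start the affiliation from the activity pattern."""
    return normalize_over_sources([[float(a) for a in row] for row in activity])


def constrain_affiliation(affiliation, activity):
    """Idea 2: zero the affiliation where a source is inactive, then renormalize."""
    masked = [
        [g * float(a) for g, a in zip(g_row, a_row)]
        for g_row, a_row in zip(affiliation, activity)
    ]
    return normalize_over_sources(masked)


# Rows: speaker 1, speaker 2, always-active noise. Columns: frames.
activity = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
]
initial = init_from_activity(activity)

# A made-up unconstrained estimate that wrongly assigns frame 0 to speaker 2.
estimate = [
    [0.1, 0.4, 0.1, 0.2],
    [0.8, 0.4, 0.6, 0.3],
    [0.1, 0.2, 0.3, 0.5],
]
constrained = constrain_affiliation(estimate, activity)
```

In the constrained result, speaker 2 gets zero affiliation in frame 0 because the activity says it was silent there, so the wrong assignment cannot survive that iteration.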

I said most likely because the activity patterns of the speakers and of the always-active noise must be sufficiently different, and the speakers must have different spatial properties (e.g. a large enough angle between the speakers from the array's perspective).
Note: I don't know how to define or measure "sufficiently different". I only know that they have to be different.

While it could be enough to start with a permutation-free initialization, I observed that the EM algorithm sometimes has trouble staying permutation-free. Maybe this is caused by activity patterns that are too similar, or by speakers that are spatially too similar.

Once the model has converged, it is unlikely that the permutation will change, so the constraint is no longer necessary.
If you set iterations_post to 1, only one E-step (predict) is executed, and if a permutation happens then, you would also get issues in the subsequent beamforming.
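
The schedule of constrained iterations followed by an optional unconstrained final step can be sketched with a toy 1-D Gaussian mixture. This is not the cACGMM and not the GSS code: the function `toy_gss`, the unit-variance Gaussian model, and the data are made up, and the exact meaning given to `iterations_post` here (0 keeps the constraint in the final E-step, 1 runs a single unconstrained E-step, larger values add unconstrained EM iterations before it) is an assumption based on the description in this thread.

```python
import math


def e_step(x, means, activity=None):
    """Posterior of each source per frame, optionally masked by the activity."""
    num_sources, num_frames = len(means), len(x)
    gamma = [[0.0] * num_frames for _ in range(num_sources)]
    for t in range(num_frames):
        likes = [math.exp(-0.5 * (x[t] - m) ** 2) for m in means]
        if activity is not None:
            likes = [like * activity[k][t] for k, like in enumerate(likes)]
        total = sum(likes)
        for k in range(num_sources):
            gamma[k][t] = likes[k] / total if total > 0 else 1.0 / num_sources
    return gamma


def m_step(x, gamma):
    """Update each mean as the affiliation-weighted average of the data."""
    return [
        sum(g * v for g, v in zip(row, x)) / max(sum(row), 1e-12)
        for row in gamma
    ]


def toy_gss(x, activity, iterations=10, iterations_post=0):
    num_sources, num_frames = len(activity), len(x)
    # Permutation-free start: the affiliation is taken from the activity.
    # Assumes at least one source is active in every frame.
    counts = [sum(activity[k][t] for k in range(num_sources)) for t in range(num_frames)]
    gamma = [
        [activity[k][t] / counts[t] for t in range(num_frames)]
        for k in range(num_sources)
    ]
    means = m_step(x, gamma)
    for _ in range(iterations):
        gamma = e_step(x, means, activity)  # constrained E-step
        means = m_step(x, gamma)
    if iterations_post == 0:
        return e_step(x, means, activity), means  # final step keeps the constraint
    for _ in range(iterations_post - 1):
        gamma = e_step(x, means)  # unconstrained E-step
        means = m_step(x, gamma)
    return e_step(x, means), means  # final unconstrained E-step


# Source 0 sits near -2, source 1 near +2; their activity overlaps in frames 3-6.
x = [-2.1, -1.9, -2.0, -2.2, -1.8, 2.0, 2.1, 1.9, 2.2, 1.8]
activity = [
    [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
]
gamma_post, means_post = toy_gss(x, activity, iterations=10, iterations_post=1)
gamma_mask, means_mask = toy_gss(x, activity, iterations=10, iterations_post=0)
```

On this well-separated toy data both settings keep source 0 on the negative cluster. The constraint matters in harder cases, where similar activity patterns or similar source properties let the unconstrained EM swap the sources.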

@elissopp
Author

elissopp commented Mar 9, 2022

Thank you for your reply. I understand it much better now.

Hope everything goes well in your future research.

@boeddeker
Member

You are welcome.
