STT_EN_FASTCONFORMER_TRANSDUCER_XLARG - Throws error for Tensor + List operation in confidence calculation #10066

Open
Vladi-SmartAssets opened this issue Aug 7, 2024 · 1 comment · May be fixed by #10519
@Vladi-SmartAssets

Describe the bug

Extracting confidence scores from the STT FastConformer transducer model raises `TypeError: unsupported operand type(s) for +: 'Tensor' and 'list'`.

Steps/Code to reproduce bug

`import nemo.collections.asr as nemo_asr
from nemo.collections.asr.parts.submodules.rnnt_decoding import RNNTDecodingConfig
from nemo.collections.asr.parts.utils.asr_confidence_utils import ConfidenceConfig, ConfidenceMethodConfig


# HuggingFaceBaseModel is the reporter's own base class (not shown in this issue).
class NemoModel(HuggingFaceBaseModel):

    def __init__(self, model_name, model_path):
        super().__init__(model_name)
        self.model_path = model_path
        self.model = None

    def load_model(self):
        self.model = nemo_asr.models.EncDecRNNTBPEModel.restore_from(self.model_path, map_location="mps")

    def predict(self, input_paths):
        confidence_cfg = ConfidenceConfig(
            preserve_frame_confidence=True,  # Internally set to true if preserve_token_confidence == True
            # or preserve_word_confidence == True
            preserve_token_confidence=True,  # Internally set to true if preserve_word_confidence == True
            preserve_word_confidence=True,
            aggregation="prod",  # How to aggregate frame scores to token scores and token scores to word scores
            exclude_blank=False,  # If true, only non-blank emissions contribute to confidence scores
            tdt_include_duration=False,  # If true, calculate duration confidence for the TDT models
            method_cfg=ConfidenceMethodConfig(  # Config for per-frame scores calculation (before aggregation)
                name="max_prob",  # Or "entropy" (default), which usually works better
                entropy_type="gibbs",  # Used only for name == "entropy". Recommended: "tsallis" (default) or "renyi"
                alpha=0.5,  # Low values (<1) increase sensitivity, high values decrease sensitivity
                entropy_norm="lin",  # How to normalize (map to [0,1]) entropy. Default: "exp"
            ),
        )
        self.model.change_decoding_strategy(
            RNNTDecodingConfig(fused_batch_size=-1, strategy="greedy_batch", confidence_cfg=confidence_cfg)
        )

        transcriptions = self.model.transcribe(audio=input_paths, return_hypotheses=True)

        fastconformer_transcriptions = [x for x in transcriptions][0]

        return fastconformer_transcriptions
`
Running this with model.transcribe throws the following error:

for ts, te in zip(hyp.timestep, hyp.timestep[1:] + [len(hyp.frame_confidence)]):
TypeError: unsupported operand type(s) for +: 'Tensor' and 'list'

Expected behavior

Expected behaviour: the zip call should handle hyp.timestep being a tensor rather than a list. Here hyp.timestep is a Tensor and hyp.frame_confidence is a list of tensors (Tensor[float] and List[Tensor[float]], respectively).

Environment overview (please complete the following information)

  • Environment location: Docker
  • Method of NeMo install: pip install nemo

Environment details

  • OS version: MacOS 14.5 (23F79)
  • PyTorch version: 2.3.1
  • Python version: 3.10

Additional context

Example: Using MPS

Proposed solution:
Replace line 633 of nemo/collections/asr/parts/submodules/rnnt_decoding.py:

for ts, te in zip(hyp.timestep, hyp.timestep[1:] + [len(hyp.frame_confidence)]):

with

for ts, te in zip(hyp.timestep, hyp.timestep[1:] + len(hyp.frame_confidence)):
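
One caveat on the replacement above: when hyp.timestep is a tensor, `hyp.timestep[1:] + len(hyp.frame_confidence)` adds the length to every element instead of appending it as the final end index, so the spans would come out wrong. A minimal sketch of an alternative is to convert the timesteps to a plain list before pairing them up. The helper name `frame_spans` is hypothetical and this is not NeMo's actual fix; it only assumes the input is either a list or an object exposing `.tolist()` (as `torch.Tensor` does).

```python
from typing import List, Tuple


def frame_spans(timestep, num_frames: int) -> List[Tuple[int, int]]:
    """Pair each start step with the next one; the last span ends at num_frames.

    `timestep` may be a plain list or any object with a .tolist() method
    (torch.Tensor and numpy arrays both provide one).
    """
    # Normalise to a Python list so that `+` means concatenation, not broadcasting.
    steps = timestep.tolist() if hasattr(timestep, "tolist") else list(timestep)
    steps = [int(s) for s in steps]
    return list(zip(steps, steps[1:] + [num_frames]))
```

With this, the loop in question could be written as `for ts, te in frame_spans(hyp.timestep, len(hyp.frame_confidence)):`, which behaves the same whether the timesteps arrive as a list or as a tensor.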

@Vladi-SmartAssets Vladi-SmartAssets added the bug Something isn't working label Aug 7, 2024
@GNroy GNroy self-assigned this Aug 17, 2024
@GNroy
Collaborator

GNroy commented Aug 22, 2024

@Vladi-SmartAssets Hi,
I cannot reproduce the issue in the latest main.
What NeMo version are you using?

@GNroy GNroy linked a pull request Sep 18, 2024 that will close this issue