Add a tensorrt backend #33

Draft · wants to merge 22 commits into base: master · Changes from 1 commit
add support for multi GPU
Zhao Wang committed Apr 30, 2021
commit 659d003f1b994691bafd49ae2f96bb292fe3109e
19 changes: 12 additions & 7 deletions vcap_utils/vcap_utils/backends/base_tensorrt.py
@@ -2,19 +2,15 @@

import pycuda.driver as cuda
import tensorrt as trt
import pycuda.autoinit

from typing import Dict, List, Tuple, Optional, Any

from vcap import (
Crop,
DetectionNode,
Resize,
DETECTION_NODE_TYPE,
OPTION_TYPE,
BaseStreamState,
BaseBackend,
rect_to_coords,
)


@@ -39,13 +35,16 @@ def __init__(self, inputs_, outputs_, bindings_, stream_):


class BaseTensorRTBackend(BaseBackend):
def __init__(self, engine_bytes, width, height):
def __init__(self, engine_bytes, width, height, device_id):
super().__init__()
gpu_device_id = int(device_id[4:])  # assumes a fixed 4-char prefix, e.g. "gpu:"
cuda.init()
dev = cuda.Device(gpu_device_id)
self.ctx = dev.make_context()
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
self.trt_runtime = trt.Runtime(TRT_LOGGER)
# load the engine
self.trt_engine = self.trt_runtime.deserialize_cuda_engine(engine_bytes)

# create execution context
self.context = self.trt_engine.create_execution_context()
# create buffers for inference
@@ -147,6 +146,7 @@ def allocate_buffers(self, batch_size: int = 1) -> \
def do_inference(self, bindings: List[int], inputs: List[HostDeviceMem], outputs: List[HostDeviceMem],
Contributor @apockill commented Apr 30, 2021:

Suggested change
def do_inference(self, bindings: List[int], inputs: List[HostDeviceMem], outputs: List[HostDeviceMem],
def _do_inference(self, bindings: List[int],
inputs: List[HostDeviceMem],
outputs: List[HostDeviceMem],
stream: cuda.Stream,
batch_size: int = 1) -> List[List[float]]:

stream: cuda.Stream, batch_size: int = 1) -> List[List[float]]:
# Transfer input data to the GPU.
self.ctx.push()
[cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
# Run inference.
# todo: use async or sync api?
@@ -172,6 +172,7 @@ def do_inference(self, bindings: List[int], inputs: List[HostDeviceMem], outputs
for batch_output in batch_outputs:
final_output.append(batch_output[i])
final_outputs.append(final_output)
self.ctx.pop()
return final_outputs

def _prepare_post_process(self):
Contributor @apockill commented Apr 30, 2021:

I'm starting to think that are too many constants and GridNet specific functions here, and it might be easier to make a separate class specifically for parsing GridNet bounding boxes.

For now, let's clean up the rest of the code first, then discuss how that would work.

Author:

These constants are only necessary for detectors. Maybe we need another parameter like is_detector in the constructor to indicate whether this capsule is a detector or a classifier?
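A minimal sketch of that idea: the `is_detector` flag, the class name, and the placeholder constant are hypothetical, not part of this PR; the point is only that classifier capsules never touch the detector constants.

```python
class BaseTensorRTBackendSketch:
    """Illustrative only: gate detector-specific setup behind a flag."""

    def __init__(self, is_detector: bool = False):
        self.is_detector = is_detector
        if is_detector:
            self._prepare_post_process()

    def _prepare_post_process(self):
        # Detector-only constant; the value here is a placeholder.
        self.box_norm = 35.0

    def run(self, outputs):
        if not self.is_detector:
            return outputs  # classifier: return raw scores untouched
        return [o / self.box_norm for o in outputs]
```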

Author:

Or we can check if these constants exist before we call the post process function
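That alternative could look like the guard below. Here `box_norm` stands in for the GridNet constants this PR sets up in `_prepare_post_process`; only backends that ran that setup will have the attribute, so the check is a proxy for "is this a detector".

```python
def maybe_postprocess(backend, outputs):
    """Run detector post-processing only when the GridNet
    constants exist on the backend (sketch, names illustrative)."""
    if hasattr(backend, "box_norm"):
        return [o / backend.box_norm for o in outputs]
    return outputs  # classifier: no detector constants, pass through
```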

Contributor:

Yeah, but I'm thinking that this is super duper specific to GridNet detectors particularly. Maybe we can just offer a function for parsing GridNet detector outputs, and name it as such.

class GridNetParser:
    def __init__(self, parameters):
        ...

    def parse_detection_results(self, prediction):
        ...


class BaseTensorRTBackend:
    ...

The benefit would be to separate all of these GridNet specific parameters out of the BaseTensorRTBackend 🤔
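A runnable version of that sketch, for illustration only: the constructor parameters and the simple confidence-threshold logic are placeholders, not the GridNet parsing this PR implements.

```python
from typing import List, Tuple


class GridNetParser:
    """Owns the GridNet-specific constants, keeping
    BaseTensorRTBackend architecture-agnostic (sketch)."""

    def __init__(self, box_norm: float, grid_w: int, grid_h: int,
                 min_confidence: float):
        self.box_norm = box_norm
        self.grid_w = grid_w
        self.grid_h = grid_h
        self.min_confidence = min_confidence

    def parse_detection_results(self, scores: List[float]) \
            -> List[Tuple[int, float]]:
        # Keep (cell index, score) pairs above the threshold; real
        # parsing would also decode boxes via box_norm and the grid.
        return [(i, s) for i, s in enumerate(scores)
                if s >= self.min_confidence]
```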

Author:

Great idea, we should have separate parsers for different architectures.

@@ -216,7 +217,7 @@ def _apply_box_norm(self, o1: float, o2: float, o3: float, o4: float, x: int, y:
o4 = (o4 + self.grid_centers_h[y]) * self.box_norm
return o1, o2, o3, o4

def postprocess(self, outputs: List[float], min_confidence: float, analysis_classes: List[int], wh_format=True)-> \
def postprocess(self, outputs: List[float], min_confidence: float, analysis_classes: List[int], wh_format=True) -> \
Tuple[List[List[int]], List[int], List[float]]:
"""
Postprocesses the inference output
@@ -262,3 +263,7 @@ def postprocess(self, outputs: List[float], min_confidence: float, analysis_clas
scores.append(float(score))

return bbs, class_ids, scores

def close(self):
super().close()
self.ctx.pop()
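The constructor above slices `device_id[4:]`, which assumes a fixed four-character prefix such as `"gpu:"` (the exact prefix is inferred from the slice, not confirmed by the PR). A small helper makes that assumption explicit and fails loudly on anything else:

```python
def parse_device_index(device_id: str, prefix: str = "gpu:") -> int:
    """Extract the numeric GPU index from an id like "gpu:1".

    Slicing device_id[4:] silently mis-parses other prefixes;
    validating first raises a clear error instead.
    """
    if not device_id.startswith(prefix):
        raise ValueError(f"expected a device id starting with "
                         f"{prefix!r}, got {device_id!r}")
    return int(device_id[len(prefix):])
```

The result would feed `cuda.Device(...)` exactly as the sliced value does today.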