latest/version3.x/pipeline_usage/PaddleOCR-VL #16734
Replies: 42 comments 44 replies
-
Recognition accuracy seems somewhat improved, but there is an obvious flaw: recognition is much slower than PP-StructureV3 or PP-OCRv5. With PP-StructureV3 or PP-OCRv5, an RTX 4090 48GB uses only 5-7GB of video memory. PaddleOCR-VL-0.9B is somewhat more accurate and uses about the same amount of video memory as PP-StructureV3, yet it is at least 20-25 times slower. This is a common problem with AI models and other vision models on the market today. Overall, what we lack is not another GPU-hungry model but a compact, efficient one. A model that uses fewer resources, recognizes more accurately, and runs faster would be one that stands the test of time.
-
What is the minimum supported CUDA version?
-
Using the official image ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.2.0-gpu-cuda12.6-cudnn9.5, I get an error:
-
After deploying the inference service locally, how do I add a frontend and connect it to the backend service? Is there an example?
-
Is this what client-side invocation means?
-
Does this currently support only whole pages as input? Is there no way to recognize a single element?
-
docker run
-
Deployment is too difficult. The best approach would be to publish the official docker.yaml file for everyone to reference.
-
How do I use paddlex to deploy a locally downloaded PaddleOCR-VL model? There is no parameter for specifying the model path, and editing the configuration file does not seem to work either.
-
Do the 3.2.0 images, such as registry.baidubce.com/paddlepaddle/paddle:3.2.0-gpu-cuda12.9-cudnn9.9 (and the others recommended in the installation guide), support the 3.3.0 pipeline mode? If not, could an image suitable for 3.3.0 be provided?
-
Which cloud providers currently support using it directly?
-
The results are much better than the previous models, though there are still one or two errors on complex tables. When will paddleocr support further fine-tuning?
-
Why does the detection box identify a header, yet the output md contains no header? Also, are custom prompts supported?
-
When is a version with CPU support expected?
-
When will there be an offline inference demo using vllm?
-
May I ask
-
Hello, there is a bug in text placement: there are clearly 2 bboxes, but the text was put into a single bbox. For example, question 10 below belongs in the second bbox:
-
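I deployed the service following sections 1 and 4 and called the interface over HTTP. The interface returned prunedResult and a document in md format. I only need the md document. Is there a setting that makes the interface omit the prunedResult field, to reduce network traffic?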
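If the server offers no switch for this, one fallback is to drop the field on the client after decoding the JSON. The sketch below is only an illustration under stated assumptions: `drop_keys` and the shape of the sample `response` are hypothetical, and only the field names `prunedResult` and `md` come from the question. It trims what the client stores or forwards; it does not reduce the bytes the server sends.

```python
from typing import Any, FrozenSet


def drop_keys(payload: Any, unwanted: FrozenSet[str]) -> Any:
    """Return a copy of a decoded JSON value with `unwanted` keys removed at every depth."""
    if isinstance(payload, dict):
        return {
            key: drop_keys(value, unwanted)
            for key, value in payload.items()
            if key not in unwanted
        }
    if isinstance(payload, list):
        return [drop_keys(item, unwanted) for item in payload]
    # Scalars (str, int, float, bool, None) are returned unchanged.
    return payload


# Hypothetical response body: the real nesting of the service's JSON may differ.
response = {"result": {"pages": [{"prunedResult": {"boxes": [1, 2]}, "md": "# Title"}]}}
slimmed = drop_keys(response, frozenset({"prunedResult"}))
print(slimmed)
```

Because the function walks the whole structure, it does not depend on where in the response the field actually sits.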
-
How can I tell whether the inference acceleration framework is actually being used? The elapsed time feels about the same with and without it.
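One framework-agnostic way to check is to time the same request several times with and without the acceleration backend and compare the medians. A minimal sketch, assuming each configuration can be wrapped in a zero-argument callable; `median_latency`, `speedup`, and the `time.sleep` stand-ins are hypothetical and not part of PaddleOCR.

```python
import statistics
import time
from typing import Callable


def median_latency(run_once: Callable[[], object], repeats: int = 5, warmup: int = 1) -> float:
    """Median wall-clock seconds of `run_once`, after discarding warm-up calls."""
    for _ in range(warmup):
        run_once()  # warm-up runs absorb one-off costs such as model loading
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline: Callable[[], object], candidate: Callable[[], object], repeats: int = 5) -> float:
    """How many times faster `candidate` is than `baseline` (values above 1 mean faster)."""
    return median_latency(baseline, repeats) / median_latency(candidate, repeats)


# Stand-ins for "predict without acceleration" and "predict with acceleration".
ratio = speedup(lambda: time.sleep(0.04), lambda: time.sleep(0.01), repeats=3)
print(f"candidate is about {ratio:.1f}x faster")
```

In practice the two callables would issue the same request against each deployment; a ratio close to 1 suggests the accelerated path is not actually in use or is not the bottleneck.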
-
How much GPU memory does PaddleOCR-VL need to run? I get an out-of-memory error on an RTX3090 24G. How would I deploy it with two 24G cards?
-
Windows service deployment: I deployed with Docker Compose, and paddlepaddle/paddleocr-genai-vllm-server:latest-offline fails during startup. Environment: CUDA 12.8, GPU compute capability 7.5, 8GB. The error is below. How can I fix it?
(EngineCore_DP0 pid=45) Traceback (most recent call last):
(EngineCore_DP0 pid=45)   File "/usr/local/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore_DP0 pid=45)     self.run()
(EngineCore_DP0 pid=45)   File "/usr/local/lib/python3.10/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=45)     self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=45)   File "/usr/local/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 722, in run_engine_core
(EngineCore_DP0 pid=45)     raise e
(EngineCore_DP0 pid=45)   File "/usr/local/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 709, in run_engine_core
(EngineCore_DP0 pid=45)     engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=45)   File "/usr/local/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 505, in __init__
(EngineCore_DP0 pid=45)     super().__init__(vllm_config, executor_class, log_stats,
(EngineCore_DP0 pid=45)   File "/usr/local/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 91, in __init__
(EngineCore_DP0 pid=45)     self._initialize_kv_caches(vllm_config)
(EngineCore_DP0 pid=45)   File "/usr/local/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 192, in _initialize_kv_caches
(EngineCore_DP0 pid=45)     kv_cache_configs = [
(EngineCore_DP0 pid=45)   File "/usr/local/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 193, in <listcomp>
(EngineCore_DP0 pid=45)     get_kv_cache_config(vllm_config, kv_cache_spec_one_worker,
(EngineCore_DP0 pid=45)   File "/usr/local/lib/python3.10/site-packages/vllm/v1/core/kv_cache_utils.py", line 1110, in get_kv_cache_config
(EngineCore_DP0 pid=45)     check_enough_kv_cache_memory(vllm_config, kv_cache_spec, available_memory)
(EngineCore_DP0 pid=45)   File "/usr/local/lib/python3.10/site-packages/vllm/v1/core/kv_cache_utils.py", line 691, in check_enough_kv_cache_memory
(EngineCore_DP0 pid=45)     raise ValueError("No available memory for the cache blocks. "
(EngineCore_DP0 pid=45) ValueError: No available memory for the cache blocks. Try increasing
-
Could the fine-tuning documentation be improved, including vllm deployment after fine-tuning?
-
Is there a parameter so that images and charts in a PDF that I do not want recognized are returned unchanged as images?
-
A question: the results returned by the service deployed with Docker Compose seem to differ quite a bit from those of the pipeline.predict() method.
-
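When I test a driver's license at the official address with "enable document unwarping" turned on, the recognition result is very accurate. But when I use the PaddleOCR-VL model through the service deployment API, calling the layout-parsing interface via POST, fields such as the date of birth cannot be parsed. What is the reason?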
-
How can I trim the recognition results returned by the interface after service deployment?
-
When will Ascend NPU be supported?
-
After local deployment, recognition of the handwritten parts of PDF files is very poor and does not match the official demo. Which parameters should I adjust to improve handwriting recognition?
-
Is there any solution for extremely slow CPU inference? For the same PDF page, it is 26 times slower than on an L20. Are there any CPU acceleration methods?
-
latest/version3.x/pipeline_usage/PaddleOCR-VL
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
https://www.paddleocr.ai/latest/version3.x/pipeline_usage/PaddleOCR-VL.html