whuhxb
💭 On The Road!!

  • Python GNU General Public License v3.0 Updated Mar 31, 2025
  • Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

    Jupyter Notebook Apache License 2.0 Updated Mar 30, 2025
  • ProLIP-1 Public

    Forked from astra-vision/ProLIP

    CLIP's Visual Embedding Projector is a Few-shot Cornucopia

    Shell Updated Mar 30, 2025
  • dita-ot Public

    Forked from dita-ot/dita-ot

    DITA Open Toolkit — the open-source publishing engine for content authored in the Darwin Information Typing Architecture.

    Java Apache License 2.0 Updated Mar 28, 2025
  • ✈️ Accelerating Vision Diffusion Transformers with Skip Branches.

    Python Apache License 2.0 Updated Mar 28, 2025
  • uni4d Public

    Forked from Davidyao99/uni4d
    Python MIT License Updated Mar 28, 2025
  • splatnav Public

    Forked from chengine/splatnav
    Python MIT License Updated Mar 28, 2025
  • Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

    Python Apache License 2.0 Updated Mar 27, 2025
  • EVolSplat Public

    Forked from Miaosheng1/EVolSplat

    Official code of CVPR 2025 EVolSplat

    Python Other Updated Mar 27, 2025
  • OpenSDI Public

    Forked from iamwangyabin/OpenSDI

    Official repository for CVPR 2025 paper: OpenSDI: Spotting Diffusion-Generated Images in the Open World

    Python Updated Mar 26, 2025
  • LPOSS Public

    Forked from vladan-stojnic/LPOSS

    Code for LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation (CVPR2025)

    Python MIT License Updated Mar 26, 2025
  • surg-3m Public

    Forked from visurg-ai/surg-3m

    Official repository for the paper "Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings".

    Python Updated Mar 26, 2025
  • TokenHSI Public

    Forked from liangpan99/TokenHSI

    [CVPR 2025] TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization

    Updated Mar 26, 2025
  • Python MIT License Updated Mar 26, 2025
  • PanoGS Public

    Forked from zhaihongjia/PanoGS

    [CVPR 2025] PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding

    Apache License 2.0 Updated Mar 26, 2025
  • [CVPR'25] Official PyTorch implementation of AvatarArtist: Open-Domain 4D Avatarization.

    Python Apache License 2.0 Updated Mar 26, 2025
  • oor Public

    Forked from snuvclab/oor
    Updated Mar 26, 2025
  • PAVE Public

    Forked from dragonlzm/PAVE

    This repo holds the implementation of PAVE: Patching and Adapting Video Large Language Models (CVPR2025)

    Python Updated Mar 26, 2025
  • CAFe Public

    Forked from haoyu-bu/CAFe

    Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"

    Python Apache License 2.0 Updated Mar 26, 2025
  • PS3 Public

    Forked from NVlabs/PS3

    Scaling Vision Pre-Training to 4K Resolution

    Updated Mar 26, 2025
  • [CVPR 2025] Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models

    Python Updated Mar 26, 2025
  • Updated Mar 25, 2025
  • CamSAM2 Public

    Forked from zhoustan/CamSAM2
    Updated Mar 25, 2025
  • Change3D Public

    Forked from zhuduowang/Change3D

    The official code of Change3D: Revisiting Change Detection and Captioning from A Video Modeling Perspective.

    Python Updated Mar 25, 2025
  • [CVPR 2025] Good Keypoints for the Two-View Geometry Estimation Problem. The reference implementation of the paper.

    Updated Mar 25, 2025
  • Splat-LOAM Public

    Forked from rvp-group/Splat-LOAM

    2D Gaussian Splatting based LiDAR Odometry And Mapping

    BSD 3-Clause "New" or "Revised" License Updated Mar 25, 2025
  • CoMP-MM Public

    Forked from SliMM-X/CoMP-MM

    Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"

    Python Apache License 2.0 Updated Mar 25, 2025
  • Beyond Accuracy: What Matters in Designing Well-Behaved Models?

    Python Apache License 2.0 Updated Mar 25, 2025
  • Python Apache License 2.0 Updated Mar 25, 2025
  • NexusGS Public

    Forked from USMizuki/NexusGS

    [CVPR'25] NexusGS: Sparse View Synthesis with Epipolar Depth Priors in 3D Gaussian Splatting

    JavaScript Updated Mar 25, 2025