QwenLM
- QwenLM (GitHub org)
  - QwenLM/Qwen2.5-VL
    - Collection Qwen2.5-VL: 3B, 7B, 32B, 72B
  - QvQ: visual reasoning
  - Qwen: large models
  - Omni: text, audio, image, video, natural speech interaction
VL
- Bounding-box markup <box></box> (see the parsing sketch after this list)
  - <|box_start|>
  - <|box_end|>
- Object references: <|object_ref_start|> <|object_ref_end|>
- min_pixels = 256*28*28
- max_pixels = 1280*28*28 (i.e. 256–1280 visual tokens, one token per 28×28-pixel patch)
- FineTune
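A minimal parsing sketch, assuming grounding output of the form <|object_ref_start|>label<|object_ref_end|><|box_start|>(x1,y1),(x2,y2)<|box_end|>; the regex and the parse_boxes helper below are illustrative, not part of the Qwen tooling:

import re

# Hypothetical helper: pull (label, box) pairs out of decoded model output,
# assuming the token format shown in the list above.
BOX_RE = re.compile(
    r"<\|object_ref_start\|>(.*?)<\|object_ref_end\|>\s*"
    r"<\|box_start\|>\((\d+),\s*(\d+)\),\s*\((\d+),\s*(\d+)\)<\|box_end\|>"
)

def parse_boxes(text):
    """Return a list of (label, (x1, y1, x2, y2)) tuples."""
    return [(m[0], tuple(int(v) for v in m[1:])) for m in BOX_RE.findall(text)]

print(parse_boxes(
    "<|object_ref_start|>dog<|object_ref_end|>"
    "<|box_start|>(12,34),(200,220)<|box_end|>"
))
# -> [('dog', (12, 34, 200, 220))]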
FAQ
macOS (MPS backend): "Dimension out of range" error — load the model with attn_implementation="eager":
import torch
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor

model_path = "Qwen/Qwen2.5-VL-3B-Instruct"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",  # change this: use eager attention on MPS to avoid the error
    device_map="mps",
)
min_pixels = 256*28*28
max_pixels = 1280*28*28
processor = AutoProcessor.from_pretrained(model_path, min_pixels=min_pixels, max_pixels=max_pixels)
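For completeness, a generation sketch following the inference pattern from the Qwen2.5-VL README (requires the qwen_vl_utils package; the image path and prompt are placeholders), reusing the model and processor loaded above:

from qwen_vl_utils import process_vision_info

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "demo.jpg"},  # placeholder image path
        {"type": "text", "text": "Describe this image."},
    ],
}]

# Build the chat prompt and collect the vision inputs.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs, videos=video_inputs,
                   padding=True, return_tensors="pt").to(model.device)

# Generate, then strip the prompt tokens before decoding.
generated_ids = model.generate(**inputs, max_new_tokens=128)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated_ids)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])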