How would you describe them?
A 3B-active-parameter native unified multimodal model for image and video understanding, generation, and editing.