Filter

Efficient Inference of text-to-video (OpenSora) on AWS Graviton Instances

Description

Why is this important?

This project investigates the deployment and optimization of text-to-video transformer models on Arm-based instances, leveraging CPU-only execution for cost-effective and scalable inference. Vision Transformers, though typically run on GPUs, are increasingly desire to operate in resource-constrained environments for power efficiency.

Project Summary

The aim of this project is to port, benchmark, and optimize a pre-trained ViT model (e.g., OpenSora) on Arm-based instances. This could include post-training quantization and investigation in how to speed up performance. Students will explore efficiency techniques such as INT8 quantization, refactoring of expensive operations, and memory-efficient transformer kernels, and compare results across GPU and CPU platforms. Deliverables include a reproducable inference pipeline and a technical report outlining bottlenecks and optimization strategies.

Prequisites

  • Intemediate understanding of Python.
  • Understanding of transformer architectures, vision transformer architectures and inference optimization
  • Experience using PyTorch or ONNX Runtime (CPU execution provider)
  • Experience with libraries such as Hugging Face Transformers, torchvision
  • Access to Arm-based instances such as AWS Graviton3/Graviton4 (c7g, m7g, or r7g)
  • Familiarity with Linux, Docker, and cloud environments

Resources from Arm and our partners

Support Level

This project is designed to be self-serve but comes with opportunity of some community support from Arm Ambassadors, who are part of the Arm Developer program. If you are not already part of our program, click here to join.

Benefits

Standout project contributions will result in preferential internal referrals to Arm Talent Acquisition (with digital badges for CV building). And we are currently discussing with national agencies the potential for funding streams for Arm Developer Labs projects, which would flow to you, not us.

To receive the benefits, you must show us your project through our online form. Please do not include any confidential information in your contribution. Additionally if you are affiliated with an academic institution, please ensure you have the right to share your material.