AI Research Engineer (Model Compression & Quantization)

Tether.io

Remote, Brazil | Full-time | Posted May 25, 2026

Opportunity Description

About the Job

As a member of our AI research team, you will drive innovation in model compression and efficient deployment for advanced multimodal AI systems, including large language models (LLMs) and vision‑language models (VLMs). Your work will focus on reducing model footprint and computational cost while preserving accuracy, enabling high‑performance AI to run efficiently across resource‑constrained edge devices. You will apply and advance compression techniques such as quantization, knowledge distillation, and pruning to streamline complex multimodal architectures that integrate text, images, and audio.
We expect you to have deep expertise in model compression methods and a strong background in multimodal model architectures. You will adopt a hands‑on, research‑driven approach to develop, test, and implement novel compression strategies that balance model size, latency, throughput, and accuracy. Your responsibilities include building robust compression pip...

Full-time | Software architecture and design

Interested in this opportunity? Apply now through Expertini.

Location Remote

Country Brazil

Type Full-time

Category Software architecture and design

Posted May 25, 2026

Deadline July 04, 2026
