About Us
BenchFlow is creating a unified runtime for AI benchmarks. We host the largest library of rigorously designed evaluations (e.g., CMU's WebArena and coding-agent benchmarks) and enable enterprises to run them via API.
Role
- Design scalable systems to execute benchmarks on cloud infrastructure
- Optimize runtime performance for Python/Node.js-based AI workflows
- Build tools for tracing, logging, and dynamic leaderboards
Requirements
- Proficiency in Python, async programming, and cloud platforms (AWS/GCP)
- Experience with distributed systems or developer tools (e.g., CI/CD)
- [Intern] Current student or recent graduate in CS/Engineering
Nice to Have
- Knowledge of AI agent frameworks (LangChain, LlamaIndex)
- Familiarity with benchmarks like SWE-bench, WebArena
Salary and Perks
- Full-time (Bay Area): $130k-170k/yr base plus 0.5%-1.5% equity; internship (Bay Area): $6k/mo base
- Full-time (remote): salary negotiable; internship (remote): ¥13k-¥24k
- Participate in developing open-source tools used by thousands of developers
- Full-time positions receive equity in a rapidly growing AI infrastructure company