Figure’s vision is to deploy autonomous humanoids at a global scale. Our AI team is looking for an experienced Distributed Systems Engineer to take our AI infrastructure to the next level. This role is focused on building software for AI training and deployment, behavior evaluation, and data management. The ideal candidate has experience building tools and infrastructure for large-scale deep learning systems.
Responsibilities:
- Design and implement software tools used to collect and manage data, train deep neural networks, and deploy them on humanoid robots
- Collaborate with our AI and robotics engineers to identify software requirements and take the lead on implementing them
- Run and maintain reliable backend distributed systems at scale
- Collaborate with our customers on defining and executing the software strategy for deploying humanoid robots in production
Requirements:
- Bachelor's or Master's degree in Computer Science, Robotics, Engineering, or a related field
- Experience with Python and an ML framework (PyTorch, JAX, TensorFlow, etc.)
- Minimum of 4 years of professional, full-time experience building reliable backend systems
- Experience with Linux and command line tools
- Experience using and managing data stores (Postgres, MySQL, ElasticSearch, Redis, etc.)
Bonus Qualifications:
- Experience managing HPC clusters for deep neural network training
- Experience managing cloud infrastructure (AWS, Azure, GCP)
- Experience with job scheduling / orchestration tools (SLURM, Kubernetes, LSF, etc.)
- Experience with configuration management tools (Ansible, Terraform, Puppet, Chef, etc.)
- Experience building data annotation and dataset management tools