Description
You will play a key role in designing, implementing, and optimizing ML solutions for highly constrained compute environments. This cross-disciplinary role blends expertise in embedded systems, computer architecture, and machine learning to unlock new applications in areas such as IoT, wearables, robotics, and autonomous systems.
RESPONSIBILITIES:
- Design and implement embedded ML pipelines on microcontrollers and custom SoCs with tight compute, memory, and power constraints.
- Optimize and quantize deep learning models for real-time inference on edge platforms.
- Develop and maintain low-level firmware in C/C++ to integrate ML models with custom hardware accelerators and sensors.
- Conduct performance benchmarking, memory profiling, and bottleneck analysis across various embedded platforms.
- Collaborate closely with ML researchers, hardware architects, and product engineers to co-design efficient ML solutions from model training to deployment.
- Evaluate emerging edge ML techniques, inference runtimes and compilers (e.g., TVM, TFLite Micro, CMSIS-NN), and toolchains to advance the team's capabilities.
- Contribute to the overall system architecture with a deep understanding of embedded compute, memory hierarchies, and data flow optimization.