Job summary
This role involves creating and refining coding test cases for AI systems, with a focus on validating end-to-end behaviors and complex reasoning tasks. Contributors work on a per-project basis, with flexible hours and fully remote work.
Qualifications
- Degree in Computer Science, Software Engineering, or a related field
- Over 5 years of experience in software development, primarily using Python (pytest, async/await, subprocess, file operations)
- Background in full-stack development, with experience in both React-based interfaces and robust back-end systems
- Proven experience in writing functional and integration tests
- Familiarity with Docker containers for local evaluations
- Understanding of CI/CD processes, particularly with GitHub Actions
- Proficiency in English at a B2 level
Responsibilities
- Review and refine realistic coding tasks based on production codebases
- Write comprehensive functional tests that validate actual behaviors and edge cases
- Create challenging coding tasks that require complex reasoning and information gathering
- Analyze AI failures to identify strengths and weaknesses of the model
- Iterate on tasks based on feedback from QA reviewers who evaluate work on quality criteria
Compensation
Contributors can earn up to $50 per hour, depending on their expertise and contribution pace. Compensation may vary across different projects based on their complexity and requirements.