Open-source LLM SWE benchmark