Spike: Investigate Server-Side Code Execution with Judge0 #3

### 🧠 Context

[Judge0](https://judge0.com/) is an open-source code execution service that runs submissions in isolated containers on a server. Unlike the Pyodide and Java spikes, this approach requires infrastructure that the society would need to set up and maintain. The upside is that it supports a wide range of languages, Python and Java included, with no browser compatibility concerns. The downside is operational complexity for a volunteer-run project.

This spike figures out what it actually takes to get Judge0 running, how it handles Python and Java questions, and whether the operational burden is realistic. The repo has example questions and sample submissions at `questions/` and `submissions/`. See `SCHEMA.md` for the question format and test file conventions.

### 🎯 What we're hoping to do

- Self-host a Judge0 instance.
- Send a student's Python or Java submission to it alongside a test file.
- Get back pass/fail results matching the JSON format in `SCHEMA.md`.
- Understand what it takes to keep this running reliably.

### 🔍 What to look into

Work through the following questions and record your findings.

1. **Setup complexity.** What does self-hosting Judge0 actually involve? Walk through what you did: prerequisites, how long it took, and what was confusing or poorly documented. Also note whether the public hosted API is a realistic alternative — cover any rate limits, cost, or data concerns.

2. **Infrastructure requirements.** What kind of server does Judge0 need to run comfortably? What happens during a busy period — say, 100 students submitting simultaneously during a lab? What does it mean for a volunteer society to own and operate this long-term: keeping it running, handling updates, dealing with abuse, cost of hosting?

3. **End-to-end latency.** From submission sent to result received, how long does it take? Break it down: queue wait, execution time, API round-trip. What's the experience like for a student waiting on feedback?

4. **Question and submission workflow.** How does the full loop work: we write a test file, a student submits code, and the result comes back as structured pass/fail? A standard Judge0 submission carries a single `source_code` field, so how do you get both the student's code and the test file in there? Options worth evaluating include concatenating the two files and using Judge0's `additional_files` field for multi-file programs. Note any friction with the `SCHEMA.md` conventions.

5. **Time and memory limits.** How are limits configured — per submission at request time, or set platform-wide? How does Judge0 surface timeouts and memory errors back through the API?

6. **Python and Java support.** Confirm both languages work end-to-end. Run `submissions/add-two-numbers/correct.py` and `submissions/add-two-numbers-java/correct.java` through your instance and verify you get the expected output. Note any differences in how the two languages behave.
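
For questions 4 and 5, one possible starting point is to concatenate the student's code and the test file into a single source and set limits per request. The sketch below is only an assumption about how that could look: `build_python_submission` is a hypothetical helper, language ID 71 is what a stock Judge0 CE install typically reports for Python 3 (confirm with `GET /languages` on your instance), and the concatenation only works if the test file calls the student's functions directly instead of importing them from a module. Java would need a different merging strategy.

```python
import base64

# Assumed ID for Python 3 on a stock Judge0 CE install; verify via GET /languages.
PYTHON_LANGUAGE_ID = 71


def build_python_submission(student_code: str, test_code: str,
                            cpu_time_limit: float = 2.0,
                            memory_limit_kb: int = 128_000) -> dict:
    """Merge student code and tests into one Judge0 submission payload."""
    # Student code goes first so the tests can call its functions directly.
    combined = student_code.rstrip() + "\n\n# --- tests ---\n" + test_code
    return {
        "language_id": PYTHON_LANGUAGE_ID,
        # Base64 sidesteps escaping issues; send with ?base64_encoded=true.
        "source_code": base64.b64encode(combined.encode("utf-8")).decode("ascii"),
        "cpu_time_limit": cpu_time_limit,  # seconds, applies to this submission only
        "memory_limit": memory_limit_kb,   # kilobytes, applies to this submission only
    }
```

Whether the instance accepts these per-request limits depends on the platform-wide maximums in its configuration, which is part of what question 5 should establish.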

### 🖥 How to demonstrate it

Get a Judge0 instance running locally (Docker is fine) or use the public API for the demo.

The demo should:
1. Submit `submissions/add-two-numbers/correct.py` with the test file `questions/add-two-numbers/tests.py` to Judge0.
2. Poll for or receive the result.
3. Print the JSON output and a plain-text pass/fail verdict.
4. Repeat with `wrong-answer.py` and `broken.py` to confirm the failure and error cases surface correctly.
5. Repeat with the Java question (`submissions/add-two-numbers-java/`) to confirm Java works end-to-end.

A simple script (Python, shell, or Node.js) hitting the Judge0 API is fine. No UI needed.
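
As a rough illustration of steps 1 to 3, the script below creates a submission, polls until Judge0 reports a final status, and turns that status into a plain-text verdict. It is a sketch under assumptions: `JUDGE0_URL` points at a local Docker instance on Judge0's default port, no authentication is configured, and "pass" means status 3 (Accepted), which is what Judge0 reports when the program exits cleanly and no `expected_output` was supplied. Mapping the result to the `SCHEMA.md` JSON format is left out because it depends on that schema.

```python
import json
import time
import urllib.request

JUDGE0_URL = "http://localhost:2358"  # assumption: local instance, default port

# Judge0 status IDs: 1 = In Queue, 2 = Processing, 3 = Accepted, 4+ = failure or error.
STILL_RUNNING = {1, 2}


def create_submission(payload: dict) -> str:
    """POST a submission asynchronously and return its token."""
    req = urllib.request.Request(
        f"{JUDGE0_URL}/submissions?base64_encoded=true&wait=false",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["token"]


def get_submission(token: str) -> dict:
    """Fetch the current state of a submission."""
    url = f"{JUDGE0_URL}/submissions/{token}?base64_encoded=true"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def poll(token: str, fetch=get_submission, interval: float = 0.5,
         max_attempts: int = 40) -> dict:
    """Call fetch(token) until the submission leaves the queued/processing states."""
    for _ in range(max_attempts):
        result = fetch(token)
        if result["status"]["id"] not in STILL_RUNNING:
            return result
        time.sleep(interval)
    raise TimeoutError(f"submission {token} did not finish in time")


def verdict(result: dict) -> str:
    """Plain-text verdict: PASS only when Judge0 reports Accepted."""
    status = result["status"]
    return "PASS" if status["id"] == 3 else f"FAIL ({status['description']})"
```

The `fetch` parameter exists so the polling logic can be exercised without a running instance; in the demo it defaults to the real HTTP call.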

Include a short written document covering your answers to the questions above, the actual latency numbers you measured, and a frank assessment of what it would take to keep this running as a volunteer-run service.
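
For the latency numbers, one way to get the breakdown asked for in question 3 is to combine client-side timing with the timestamps Judge0 records on the submission. The helper below is a hypothetical sketch: it assumes the result includes `created_at`, `finished_at`, and `time`, which may need to be requested explicitly through the `fields` query parameter, and that `sent_at` and `received_at` are client clock readings in seconds (for example from `time.monotonic()`). Note that `time` is Judge0's reported run time for the program, so the "queue and overhead" figure also absorbs compilation and sandbox start-up.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latency_breakdown(result: dict, sent_at: float, received_at: float) -> dict:
    """Split total latency (seconds) into execution, server-side wait, and the rest."""
    total = received_at - sent_at
    server = (parse_ts(result["finished_at"])
              - parse_ts(result["created_at"])).total_seconds()
    execution = float(result["time"])
    return {
        "total": total,
        "execution": execution,
        # Time on the server that was not spent running the program.
        "queue_and_overhead": server - execution,
        # Everything outside the server: network round-trips and polling delay.
        "network_and_polling": total - server,
    }
```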

### ✅ Is it usable for our case?

Give a clear verdict: yes, no, or yes with caveats. If there are caveats, list the important ones. This is the main thing the team needs from this ticket.
