CodeDocs Vault

01 - Project Overview

What This Project Does

ML Intern is an autonomous AI agent specialized in machine learning engineering tasks in the Hugging Face ecosystem. It can research papers, find datasets, write training scripts, launch GPU jobs, manage repositories, and iterate on results -- all through natural-language conversation.

Think of it as a "junior ML engineer" you can delegate tasks to: "Fine-tune Llama 3 on this dataset and push the model to my Hub repo." The agent will research the right approach, write a training script, test it in a sandbox, launch a GPU job, monitor it, and push the result.

Problem It Solves

ML workflows on Hugging Face involve juggling many services: the Hub API (repos, models, datasets), Inference Endpoints, Spaces, training jobs, documentation across dozens of libraries, and a vast ecosystem of papers and examples. ML Intern unifies all of this behind a single conversational interface, automating the tedious orchestration while keeping the human in the loop for critical decisions (GPU spending, repo modifications, etc.).

Target Users

  1. ML practitioners who use Hugging Face regularly and want to accelerate their workflow
  2. Researchers who need to find papers, explore citation graphs, discover datasets, and run experiments
  3. Developers building on HF infrastructure who want help navigating the ecosystem

The system prompt explicitly accounts for users at different skill levels -- it instructs the agent to research thoroughly before implementing (to avoid hallucinating wrong APIs or outdated patterns), making it useful even for experienced practitioners working outside their comfort zone.

Tech Stack

Backend

| Technology | Version | Purpose |
| --- | --- | --- |
| Python | >= 3.11 | Core language |
| FastAPI | >= 0.115 | REST API + SSE streaming server |
| Uvicorn | >= 0.32 | ASGI server |
| LiteLLM | >= 1.83 | Universal LLM API abstraction (Anthropic, OpenAI, HF Router) |
| FastMCP | >= 3.2 | Model Context Protocol client for HF's MCP server |
| Hugging Face Hub | >= 1.0 | Hub API client (repos, models, datasets, Spaces, jobs) |
| Pydantic | >= 2.12 | Request/response validation |
| httpx | >= 0.27 | Async HTTP client |
| Rich | >= 13.0 | Terminal rendering (markdown, panels, live display) |
| prompt-toolkit | >= 3.0 | Advanced CLI input handling |
| Whoosh | >= 2.7 | Full-text search indexing for HF docs and OpenAPI specs |
| thefuzz | >= 0.22 | Fuzzy string matching for GitHub example discovery |
| nbconvert / nbformat | >= 7.16 / >= 5.10 | Jupyter notebook conversion |
| uv | (build tool) | Fast Python package manager, used for deps and lockfile |

Frontend

| Technology | Purpose |
| --- | --- |
| React 18 + TypeScript | UI framework |
| Vite | Build toolchain and dev server |
| Vercel AI SDK (ai + @ai-sdk/react) | Chat UI abstraction, streaming, tool approval UX |
| MUI (Material UI) | Component library |
| Zustand | State management (3 stores) |
| react-markdown + react-syntax-highlighter | Markdown and code rendering |

Infrastructure

| Technology | Purpose |
| --- | --- |
| Docker (multi-stage) | Production deployment to HF Spaces |
| HF Spaces | Hosting platform (port 7860, UID 1000) |
| HF OAuth 2.0 | Authentication via HuggingFace accounts |
| HF Inference Router | Model routing with provider selection |
| HF Dataset Repos | Session trajectory persistence |

Why These Choices

Dual Interface

The project ships two complete interfaces:

  1. CLI (agent/main.py): Rich terminal UI with particle logo animation, CRT boot sequence, typewriter streaming, live sub-agent tracking, and interactive approval prompts. Invoked via ml-intern command.

  2. Web UI (frontend/ + backend/): React SPA with multi-session support, SSE streaming, drag-to-resize code panel, research sub-agent visualization, and script editing for tool approval.

Both interfaces share the same agent core (agent/core/) and tool suite (agent/tools/), connected through an async queue-based architecture.
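The queue-based connection between the shared agent core and either interface can be sketched with `asyncio`. The event shapes and function names below are illustrative assumptions, not the project's actual internals.

```python
import asyncio

async def agent_core(events: asyncio.Queue) -> None:
    # Hypothetical agent loop: emit streamed tokens, then signal completion.
    for token in ["Fine", "-tuning", " plan", " ready"]:
        await events.put({"type": "token", "text": token})
    await events.put({"type": "done"})

async def render_interface(events: asyncio.Queue) -> str:
    # Either consumer (Rich CLI or SSE web backend) drains the same
    # queue and renders events; only the rendering differs.
    tokens = []
    while True:
        event = await events.get()
        if event["type"] == "done":
            break
        tokens.append(event["text"])
    return "".join(tokens)

async def main() -> str:
    queue: asyncio.Queue = asyncio.Queue()
    producer = asyncio.create_task(agent_core(queue))
    text = await render_interface(queue)
    await producer
    return text
```

The design choice this models: the agent core never knows which interface it is talking to; it just publishes events, which keeps the CLI and web UI from drifting apart behaviorally.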