# 01 - Project Overview
## What This Project Does
ML Intern is an autonomous AI agent specialized in Machine Learning engineering tasks on the Hugging Face ecosystem. It can research papers, find datasets, write training scripts, launch GPU jobs, manage repositories, and iterate on results -- all through natural language conversation.
Think of it as a "junior ML engineer" you can delegate tasks to: "Fine-tune Llama 3 on this dataset and push the model to my Hub repo." The agent will research the right approach, write a training script, test it in a sandbox, launch a GPU job, monitor it, and push the result.
## Problem It Solves
ML workflows on Hugging Face involve juggling many services: the Hub API (repos, models, datasets), Inference Endpoints, Spaces, training jobs, documentation across dozens of libraries, and a vast ecosystem of papers and examples. ML Intern unifies all of this behind a single conversational interface, automating the tedious orchestration while keeping the human in the loop for critical decisions (GPU spending, repo modifications, etc.).
## Target Users
- ML practitioners who use Hugging Face regularly and want to accelerate their workflow
- Researchers who need to find papers, explore citation graphs, discover datasets, and run experiments
- Developers building on HF infrastructure who want help navigating the ecosystem
The system prompt explicitly accounts for users at different skill levels -- it instructs the agent to research thoroughly before implementing (to avoid hallucinating wrong APIs or outdated patterns), making it useful even for experienced practitioners working outside their comfort zone.
## Tech Stack
### Backend
| Technology | Version | Purpose |
|---|---|---|
| Python | >= 3.11 | Core language |
| FastAPI | >= 0.115 | REST API + SSE streaming server |
| Uvicorn | >= 0.32 | ASGI server |
| LiteLLM | >= 1.83 | Universal LLM API abstraction (Anthropic, OpenAI, HF Router) |
| FastMCP | >= 3.2 | Model Context Protocol client for HF's MCP server |
| Hugging Face Hub | >= 1.0 | Hub API client (repos, models, datasets, Spaces, jobs) |
| Pydantic | >= 2.12 | Request/response validation |
| httpx | >= 0.27 | Async HTTP client |
| Rich | >= 13.0 | Terminal rendering (markdown, panels, live display) |
| prompt-toolkit | >= 3.0 | Advanced CLI input handling |
| Whoosh | >= 2.7 | Full-text search indexing for HF docs and OpenAPI specs |
| thefuzz | >= 0.22 | Fuzzy string matching for GitHub example discovery |
| nbconvert/nbformat | >= 7.16/5.10 | Jupyter notebook conversion |
| uv | (build tool) | Fast Python package manager, used for deps and lockfile |
### Frontend
| Technology | Purpose |
|---|---|
| React 18 + TypeScript | UI framework |
| Vite | Build toolchain and dev server |
| Vercel AI SDK (ai + @ai-sdk/react) | Chat UI abstraction, streaming, tool approval UX |
| MUI (Material UI) | Component library |
| Zustand | State management (3 stores) |
| react-markdown + react-syntax-highlighter | Markdown and code rendering |
### Infrastructure
| Technology | Purpose |
|---|---|
| Docker (multi-stage) | Production deployment to HF Spaces |
| HF Spaces | Hosting platform (port 7860, UID 1000) |
| HF OAuth 2.0 | Authentication via Hugging Face accounts |
| HF Inference Router | Model routing with provider selection |
| HF Dataset Repos | Session trajectory persistence (see the sketch below this table) |
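
For illustration, persisting a session to a dataset repo can be done with `huggingface_hub`'s upload API. A minimal sketch, assuming trajectories serialize to JSON; the repo ID and path layout here are hypothetical, not the project's actual schema:

```python
import json
from huggingface_hub import HfApi

def persist_trajectory(session_id: str, trajectory: list[dict]) -> None:
    """Upload a session trajectory to a private HF dataset repo.

    Repo ID and file layout are illustrative placeholders.
    """
    api = HfApi()  # uses the HF_TOKEN env var or cached login
    repo_id = "your-username/ml-intern-sessions"  # hypothetical repo
    api.create_repo(repo_id, repo_type="dataset", private=True, exist_ok=True)
    api.upload_file(
        path_or_fileobj=json.dumps(trajectory, indent=2).encode(),
        path_in_repo=f"sessions/{session_id}.json",
        repo_id=repo_id,
        repo_type="dataset",
        commit_message=f"Persist session {session_id}",
    )
```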
## Why These Choices
- LiteLLM over direct SDKs: Provides a unified `acompletion()` interface across Anthropic, OpenAI, and the HF Inference Router. This allows switching models at runtime without code changes (see the first sketch after this list).
- FastMCP: Hugging Face publishes an official MCP server. FastMCP provides a typed Python client that can discover and call MCP tools dynamically (sketched below).
- Vercel AI SDK: Provides battle-tested React hooks for streaming chat UIs with tool calling support (including approval workflows), saving significant frontend complexity.
- Zustand over Redux: Lightweight, minimal boilerplate. Three small stores with clear boundaries instead of one monolithic state tree.
- Whoosh: Pure-Python full-text search engine that runs in-process. No external search server is needed for indexing HF documentation and OpenAPI specs (see the indexing example below).
- Rich: The de facto standard for terminal rendering in Python. Used for markdown, panels, theming, and live display.
- uv: Modern Python package manager with lockfile support. `uv sync --frozen` ensures reproducible builds in Docker.
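
To make the LiteLLM point concrete, here is a minimal sketch of runtime model switching through a single `acompletion()` call. The model identifiers are illustrative examples of LiteLLM's provider-prefix convention, not the project's configured defaults:

```python
import asyncio
import litellm

async def ask(model: str, prompt: str) -> str:
    """One code path for any provider: LiteLLM routes on the model prefix."""
    response = await litellm.acompletion(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

async def main() -> None:
    # Swapping providers is a string change, not a code change.
    for model in (
        "anthropic/claude-sonnet-4-20250514",            # Anthropic route
        "openai/gpt-4o-mini",                            # OpenAI route
        "huggingface/meta-llama/Llama-3.1-8B-Instruct",  # HF Inference route
    ):
        print(await ask(model, "Say hello in one word."))

asyncio.run(main())
```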
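
Similarly, a minimal sketch of dynamic tool discovery with FastMCP's standard `Client` API, assuming Hugging Face's published MCP endpoint; the tool name and arguments in the final call are hypothetical:

```python
import asyncio
from fastmcp import Client

async def main() -> None:
    # Tools are discovered at runtime, so new server-side tools
    # become available without client code changes.
    async with Client("https://huggingface.co/mcp") as client:
        tools = await client.list_tools()
        print([tool.name for tool in tools])

        # Hypothetical tool name and arguments, for illustration only.
        result = await client.call_tool("model_search", {"query": "llama"})
        print(result)

asyncio.run(main())
```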
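
And a minimal sketch of the in-process Whoosh pattern: build an index on disk once, then query it with no external service. The schema fields and documents are illustrative, not the project's actual doc index:

```python
import os
from whoosh.fields import ID, TEXT, Schema
from whoosh.index import create_in
from whoosh.qparser import QueryParser

# Illustrative schema: one entry per documentation page.
schema = Schema(path=ID(stored=True, unique=True), content=TEXT)

os.makedirs("doc_index", exist_ok=True)
ix = create_in("doc_index", schema)

# Index a couple of example pages; the real project indexes HF docs
# and OpenAPI specs.
writer = ix.writer()
writer.add_document(path="transformers/training",
                    content="Fine-tune models with the Trainer API")
writer.add_document(path="hub/repositories",
                    content="Create and manage repositories on the Hub")
writer.commit()

# Queries run entirely in-process: no search server to deploy.
with ix.searcher() as searcher:
    query = QueryParser("content", ix.schema).parse("fine-tune trainer")
    for hit in searcher.search(query, limit=5):
        print(hit["path"], round(hit.score, 2))
```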
## Dual Interface
The project ships two complete interfaces:
- CLI (`agent/main.py`): Rich terminal UI with particle logo animation, CRT boot sequence, typewriter streaming, live sub-agent tracking, and interactive approval prompts. Invoked via the `ml-intern` command.
- Web UI (`frontend/` + `backend/`): React SPA with multi-session support, SSE streaming, drag-to-resize code panel, research sub-agent visualization, and script editing for tool approval.
Both interfaces share the same agent core (`agent/core/`) and tool suite (`agent/tools/`), connected through an async queue-based architecture (sketched below).
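
A minimal sketch of that queue-based decoupling, assuming the agent core pushes events onto an `asyncio.Queue` that each interface drains in its own way; the event shape and the `run_agent` function are hypothetical stand-ins for the real agent loop:

```python
import asyncio
import json
from typing import AsyncIterator

async def run_agent(task: str, events: asyncio.Queue) -> None:
    """Hypothetical agent loop: emits events instead of rendering,
    so it stays interface-agnostic."""
    await events.put({"type": "token", "text": f"Working on: {task}"})
    await events.put({"type": "done"})

async def sse_stream(task: str) -> AsyncIterator[str]:
    """Web UI side: drain the queue into Server-Sent Events frames."""
    events: asyncio.Queue = asyncio.Queue()
    runner = asyncio.create_task(run_agent(task, events))
    while True:
        event = await events.get()
        yield f"data: {json.dumps(event)}\n\n"
        if event["type"] == "done":
            break
    await runner

async def main() -> None:
    # The CLI would drain the same event queue into Rich renderables.
    async for frame in sse_stream("fine-tune a model"):
        print(frame, end="")

asyncio.run(main())
```

In FastAPI, a generator like `sse_stream` would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`, while the CLI consumes the same events through Rich's live display.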