CodeDocs Vault

01 - Project Overview

What This Project Does

ML Intern is an autonomous AI agent specialized in machine learning engineering tasks in the Hugging Face ecosystem. It can research papers, find datasets, write training scripts, launch GPU jobs, manage repositories, and iterate on results -- all through natural-language conversation.

Think of it as a "junior ML engineer" you can delegate tasks to: "Fine-tune Llama 3 on this dataset and push the model to my Hub repo." The agent will research the right approach, write a training script, test it in a sandbox, launch a GPU job, monitor it, and push the result.

Problem It Solves

ML workflows on Hugging Face involve juggling many services: the Hub API (repos, models, datasets), Inference Endpoints, Spaces, training jobs, documentation across dozens of libraries, and a vast ecosystem of papers and examples. ML Intern unifies all of this behind a single conversational interface, automating the tedious orchestration while keeping the human in the loop for critical decisions (GPU spending, repo modifications, etc.).

Target Users

  1. ML practitioners who use Hugging Face regularly and want to accelerate their workflow
  2. Researchers who need to find papers, explore citation graphs, discover datasets, and run experiments
  3. Developers building on HF infrastructure who want help navigating the ecosystem

The system prompt explicitly accounts for users at different skill levels -- it instructs the agent to research thoroughly before implementing (to avoid hallucinating wrong APIs or outdated patterns), making it useful even for experienced practitioners working outside their comfort zone.

Tech Stack

Backend

| Technology | Version | Purpose |
| --- | --- | --- |
| Python | >= 3.11 | Core language |
| FastAPI | >= 0.115 | REST API + SSE streaming server |
| Uvicorn | >= 0.32 | ASGI server |
| LiteLLM | >= 1.83 | Universal LLM API abstraction (Anthropic, OpenAI, HF Router) |
| FastMCP | >= 3.2 | Model Context Protocol client for HF's MCP server |
| Hugging Face Hub | >= 1.0 | Hub API client (repos, models, datasets, Spaces, jobs) |
| Pydantic | >= 2.12 | Request/response validation |
| httpx | >= 0.27 | Async HTTP client |
| Rich | >= 13.0 | Terminal rendering (markdown, panels, live display) |
| prompt-toolkit | >= 3.0 | Advanced CLI input handling |
| Whoosh | >= 2.7 | Full-text search indexing for HF docs and OpenAPI specs |
| thefuzz | >= 0.22 | Fuzzy string matching for GitHub example discovery |
| nbconvert / nbformat | >= 7.16 / >= 5.10 | Jupyter notebook conversion |
| uv | (build tool) | Fast Python package manager, used for deps and lockfile |

Frontend

| Technology | Purpose |
| --- | --- |
| React 18 + TypeScript | UI framework |
| Vite | Build toolchain and dev server |
| Vercel AI SDK (ai + @ai-sdk/react) | Chat UI abstraction, streaming, tool approval UX |
| MUI (Material UI) | Component library |
| Zustand | State management (3 stores) |
| react-markdown + react-syntax-highlighter | Markdown and code rendering |

Infrastructure

| Technology | Purpose |
| --- | --- |
| Docker (multi-stage) | Production deployment to HF Spaces |
| HF Spaces | Hosting platform (port 7860, UID 1000) |
| HF OAuth 2.0 | Authentication via HuggingFace accounts |
| HF Inference Router | Model routing with provider selection |
| HF Dataset Repos | Session trajectory persistence |

Why These Choices

Dual Interface

The project ships two complete interfaces:

  1. CLI (agent/main.py): Rich terminal UI with particle logo animation, CRT boot sequence, typewriter streaming, live sub-agent tracking, and interactive approval prompts. Invoked via ml-intern command.

  2. Web UI (frontend/ + backend/): React SPA with multi-session support, SSE streaming, drag-to-resize code panel, research sub-agent visualization, and script editing for tool approval.

Both interfaces share the same agent core (agent/core/) and tool suite (agent/tools/), connected through an async queue-based architecture.
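The queue-based connection between the shared agent core and either interface can be sketched with `asyncio`. The event shapes and function names below are illustrative assumptions, not the project's actual internals.

```python
import asyncio

async def agent_core(events: asyncio.Queue) -> None:
    # Hypothetical agent loop: emit streamed tokens, then signal completion.
    for token in ["Fine", "-tuning", " plan", " ready"]:
        await events.put({"type": "token", "text": token})
    await events.put({"type": "done"})

async def render_interface(events: asyncio.Queue) -> str:
    # Either consumer (Rich CLI or SSE web backend) drains the same
    # queue and renders events; only the rendering differs.
    tokens = []
    while True:
        event = await events.get()
        if event["type"] == "done":
            break
        tokens.append(event["text"])
    return "".join(tokens)

async def main() -> str:
    queue: asyncio.Queue = asyncio.Queue()
    producer = asyncio.create_task(agent_core(queue))
    text = await render_interface(queue)
    await producer
    return text
```

The design choice this models: the agent core never knows which interface it is talking to; it just publishes events, which keeps the CLI and web UI from drifting apart behaviorally.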