Schema Reference

Complete documentation for the aigovhub.yaml schema.

Overview

The aigovhub.yaml file is the primary compliance artifact produced by AIGovHub. It documents all AI systems detected in a repository and serves as:

Version-controlled audit trail for AI system inventory
Input for compliance processes such as EU AI Act risk classification
Machine-readable documentation for automation
Source for a CycloneDX-compatible AI-SBOM

Schema Version

Current schema version: 1.0.0

schema_version: "1.0.0"

Complete Schema

# AIGovHub Artifact Schema v1.0.0
 
schema_version: "1.0.0"           # Required: Schema version
generated_at: "2025-01-15T10:30:00Z"  # Required: ISO 8601 timestamp
generated_by: "aigovhub v0.1.0"   # Required: Tool identifier
 
repository:                       # Required: Repository information
  name: "my-project"              # Required: Repository name
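  git_commit: "abc123def"         # Optional: Git commit hash
  git_branch: "main"              # Optional: Git branch name
  git_remote: "https://example.com/my-project.git"  # Optional: Git remote URL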
 
scan_config:                      # Optional: Scan configuration used
  confidence_threshold: 0.7       # Detection threshold (0.0-1.0)
  use_llm_fallback: true          # Whether LLM fallback was enabled
  llm_provider: "anthropic"       # LLM provider used (if any)
 
ai_systems:                       # Required: List of detected AI systems
  - id: "ai-001"                  # Required: Unique identifier
    name: "sentiment-analyzer"    # Required: Human-readable name
    type: "nlp"                   # Required: AI system type
    detection_confidence: 0.95    # Required: Detection confidence (0.0-1.0)
 
    source:                       # Required: Source information
      files:                      # Files where AI was detected
        - "src/models/sentiment.py"
      dependencies:               # AI-related dependencies
        - "transformers>=4.30.0"
      model_files:                # Model files found
        - "models/sentiment.pt"
 
    classification:               # Required: Risk classification
      risk_category: null         # EU AI Act risk category (null until evaluated)
 
    model_card:                   # Optional: CycloneDX model card
      model_parameters:
        architecture_family: "transformer"
        task: "text-classification"
 
    intended_purpose:             # Optional: Intended use
      description: "Classify customer feedback sentiment"
      domain: "customer_service"
 
compliance:                       # Required: Compliance status
  status: "not_evaluated"         # Current compliance status

Field Reference

Root Fields

Field	Type	Required	Description
`schema_version`	string	Yes	Schema version (e.g., "1.0.0")
`generated_at`	string (ISO 8601)	Yes	Generation timestamp
`generated_by`	string	Yes	Tool that generated the file
`repository`	object	Yes	Repository information
`scan_config`	object	No	Scan configuration used
`ai_systems`	array	Yes	List of AI systems (can be empty)
`compliance`	object	Yes	Overall compliance status
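
The required root fields in the table above can be checked with a few lines of code. The following is a minimal sketch, not AIGovHub's own validator. It assumes the artifact has already been parsed into a Python dict (for example by a YAML library), and the name `check_root_fields` is hypothetical.

```python
from datetime import datetime

# Required root fields and the Python types they parse to,
# taken from the Root Fields table.
REQUIRED_ROOT_FIELDS = {
    "schema_version": str,
    "generated_at": str,
    "generated_by": str,
    "repository": dict,
    "ai_systems": list,
    "compliance": dict,
}


def check_root_fields(artifact: dict) -> list[str]:
    """Return a list of problems found at the root of a parsed artifact."""
    problems = []
    for field, expected_type in REQUIRED_ROOT_FIELDS.items():
        if field not in artifact:
            problems.append(f"missing required field: {field}")
        elif not isinstance(artifact[field], expected_type):
            problems.append(f"{field} should be of type {expected_type.__name__}")

    # generated_at must be an ISO 8601 timestamp. A trailing "Z" is mapped
    # to "+00:00" so that older Python versions can parse it too.
    timestamp = artifact.get("generated_at")
    if isinstance(timestamp, str):
        try:
            datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        except ValueError:
            problems.append("generated_at is not a valid ISO 8601 timestamp")
    return problems
```

An empty list means the root of the artifact has the expected shape. Nested objects such as `repository` are not inspected here.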

`repository` Object

Field	Type	Required	Description
`name`	string	Yes	Repository name
`git_commit`	string	No	Git commit hash (short or full)
`git_branch`	string	No	Git branch name
`git_remote`	string	No	Git remote URL

`scan_config` Object

Field	Type	Required	Default	Description
`confidence_threshold`	float	No	0.7	Detection threshold (0.0-1.0)
`use_llm_fallback`	boolean	No	true	Whether LLM fallback was enabled
`llm_provider`	string	No	null	LLM provider used, if any (e.g., "anthropic")
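
Because every `scan_config` field is optional, a consumer has to fill in the documented defaults itself. One way to do that is sketched below, assuming the artifact is already parsed into a dict. The `ScanConfig` class and `load_scan_config` function are hypothetical names, not part of AIGovHub.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScanConfig:
    """scan_config with the defaults from the table above."""
    confidence_threshold: float = 0.7
    use_llm_fallback: bool = True
    llm_provider: Optional[str] = None

    def __post_init__(self):
        # The threshold is documented as a value between 0.0 and 1.0.
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError("confidence_threshold must be between 0.0 and 1.0")


def load_scan_config(artifact: dict) -> ScanConfig:
    """Build a ScanConfig from a parsed artifact; missing keys use defaults."""
    # "or {}" also covers a scan_config key that is present but empty.
    return ScanConfig(**(artifact.get("scan_config") or {}))
```

Keys that are not in the table raise a `TypeError` in this sketch, which is a deliberately strict choice; a more lenient reader could ignore them instead.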

`ai_systems[]` Array Items

Field	Type	Required	Description
`id`	string	Yes	Unique identifier (e.g., "ai-001")
`name`	string	Yes	Human-readable name
`type`	string	Yes	AI system type (see below)
`detection_confidence`	float	Yes	Confidence score (0.0-1.0)
`source`	object	Yes	Source information
`classification`	object	Yes	Risk classification
`model_card`	object	No	CycloneDX model card
`intended_purpose`	object	No	Intended use description
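
The per-item requirements above lend themselves to a simple check. This sketch assumes that "unique identifier" means unique within one artifact, which the schema does not state explicitly. `check_ai_systems` is a hypothetical helper that works on the parsed `ai_systems` list.

```python
REQUIRED_SYSTEM_FIELDS = (
    "id", "name", "type", "detection_confidence", "source", "classification",
)


def check_ai_systems(ai_systems: list[dict]) -> list[str]:
    """Return a list of problems found in the ai_systems array."""
    problems = []
    seen_ids = set()
    for index, system in enumerate(ai_systems):
        # Use the id in messages when present, else the array position.
        label = system.get("id", f"ai_systems[{index}]")

        for field in REQUIRED_SYSTEM_FIELDS:
            if field not in system:
                problems.append(f"{label}: missing required field {field}")

        confidence = system.get("detection_confidence")
        if isinstance(confidence, (int, float)) and not 0.0 <= confidence <= 1.0:
            problems.append(
                f"{label}: detection_confidence {confidence} is outside 0.0-1.0"
            )

        # Assumption: ids must not repeat within a single artifact.
        if "id" in system:
            if system["id"] in seen_ids:
                problems.append(f"{label}: duplicate id")
            seen_ids.add(system["id"])
    return problems
```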

AI System Types

Value	Description
`ml_model`	Traditional machine learning model
`deep_learning`	Neural network / deep learning system
`llm_integration`	Large Language Model integration
`computer_vision`	Image/video processing AI
`nlp`	Natural language processing
`reinforcement_learning`	Reinforcement learning agent
`autonomous_agent`	Autonomous AI agent
`recommendation_system`	Recommendation engine
`predictive_analytics`	Predictive analytics system
`unknown`	Unclassified AI system
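
When consuming the artifact from code, the type values above map naturally onto an enumeration. The sketch below is illustrative; the names `AISystemType` and `parse_system_type` are assumptions, not part of AIGovHub.

```python
from enum import Enum


class AISystemType(str, Enum):
    """Allowed values for ai_systems[].type, as listed in the table above."""
    ML_MODEL = "ml_model"
    DEEP_LEARNING = "deep_learning"
    LLM_INTEGRATION = "llm_integration"
    COMPUTER_VISION = "computer_vision"
    NLP = "nlp"
    REINFORCEMENT_LEARNING = "reinforcement_learning"
    AUTONOMOUS_AGENT = "autonomous_agent"
    RECOMMENDATION_SYSTEM = "recommendation_system"
    PREDICTIVE_ANALYTICS = "predictive_analytics"
    UNKNOWN = "unknown"


def parse_system_type(value: str) -> AISystemType:
    """Return the matching member; raise ValueError for a value not in the table."""
    return AISystemType(value)
```

Rejecting unlisted values with a `ValueError` catches typos such as `"NLP"` or `"llm"` early, before they reach downstream reporting.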

`source` Object

Field	Type	Required	Description
`files`	array[string]	No	Source files (relative paths)
`dependencies`	array[string]	No	AI-related package dependencies
`model_files`	array[string]	No	Model file paths

`classification` Object

Field	Type	Required	Description
`risk_category`	string or null	Yes	EU AI Act risk category (`null` until evaluated)

Risk Categories:

Value	Description
`null`	Not yet evaluated
`prohibited`	Prohibited AI systems (Article 5)
`high_risk`	High-risk AI systems (Annex III)
`limited_risk`	Limited risk with transparency requirements
`minimal_risk`	Minimal/no risk AI systems
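
A common use of these categories is to find the most severe classification in a repository. The sketch below assumes a severity ordering from `minimal_risk` up to `prohibited`. That ordering is this example's assumption rather than something the schema defines, and `highest_risk` is a hypothetical helper.

```python
from typing import Optional

# Assumed ordering, least to most severe. A null category ("not yet
# evaluated") is not ranked and is skipped below.
RISK_ORDER = ["minimal_risk", "limited_risk", "high_risk", "prohibited"]


def highest_risk(ai_systems: list[dict]) -> Optional[str]:
    """Return the most severe evaluated category, or None if none is evaluated.

    Raises ValueError if a system carries a category not in RISK_ORDER.
    """
    categories = [
        (system.get("classification") or {}).get("risk_category")
        for system in ai_systems
    ]
    evaluated = [category for category in categories if category is not None]
    if not evaluated:
        return None
    return max(evaluated, key=RISK_ORDER.index)
```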

`model_card` Object

Based on the CycloneDX machine learning model card (`modelCard`):

Field	Type	Required	Description
`model_parameters`	object	No	Model parameters
`model_parameters.architecture_family`	string	No	e.g., "transformer", "cnn"
`model_parameters.task`	string	No	e.g., "text-classification"
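
To illustrate how these fields could feed a CycloneDX document, the sketch below maps one `ai_systems[]` entry to a component dict. The mapping is an assumption of this example, not AIGovHub's documented export: it uses the CycloneDX names `machine-learning-model`, `modelCard`, `modelParameters`, and `architectureFamily`, and should be verified against the CycloneDX version you target. `to_cyclonedx_component` is a hypothetical function.

```python
def to_cyclonedx_component(system: dict) -> dict:
    """Map one ai_systems[] entry to a CycloneDX-style component dict."""
    component = {
        "type": "machine-learning-model",
        "bom-ref": system["id"],
        "name": system["name"],
    }

    # model_card and model_parameters are both optional in the schema.
    params = (system.get("model_card") or {}).get("model_parameters") or {}
    model_parameters = {}
    if "architecture_family" in params:
        model_parameters["architectureFamily"] = params["architecture_family"]
    if "task" in params:
        model_parameters["task"] = params["task"]

    # Only emit a modelCard when there is something to put in it.
    if model_parameters:
        component["modelCard"] = {"modelParameters": model_parameters}
    return component
```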

`intended_purpose` Object

Field	Type	Required	Description
`description`	string	No	Description of intended use
`domain`	string	No	Application domain

`compliance` Object

Field	Type	Required	Description
`status`	string	Yes	Compliance evaluation status

Status Values:

Value	Description
`not_evaluated`	Not yet evaluated for compliance
`in_progress`	Compliance evaluation in progress
`compliant`	Meets all requirements
`non_compliant`	Does not meet requirements
`not_applicable`	AI Act does not apply

Examples

Minimal Artifact

schema_version: "1.0.0"
generated_at: "2025-01-15T10:30:00Z"
generated_by: "aigovhub v0.1.0"
 
repository:
  name: "my-project"
 
ai_systems: []
 
compliance:
  status: "not_evaluated"
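
The same minimal artifact can be built in code, for instance in a test fixture. This is a sketch with hypothetical names (`minimal_artifact` and its parameters). It returns a plain dict and leaves serialization to YAML to whichever library you use.

```python
from datetime import datetime, timezone


def minimal_artifact(repository_name: str, generated_by: str) -> dict:
    """Return the smallest artifact that contains every required field."""
    return {
        "schema_version": "1.0.0",
        # ISO 8601 in UTC with a trailing "Z", matching the examples.
        "generated_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "generated_by": generated_by,
        "repository": {"name": repository_name},
        "ai_systems": [],
        "compliance": {"status": "not_evaluated"},
    }
```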

Single AI System

schema_version: "1.0.0"
generated_at: "2025-01-15T10:30:00Z"
generated_by: "aigovhub v0.1.0"
 
repository:
  name: "sentiment-service"
  git_commit: "abc123d"
 
ai_systems:
  - id: "ai-001"
    name: "sentiment-classifier"
    type: "nlp"
    detection_confidence: 1.0
    source:
      files:
        - "src/classifier.py"
      dependencies:
        - "transformers>=4.30.0"
        - "torch>=2.0.0"
      model_files:
        - "models/sentiment.pt"
    classification:
      risk_category: "minimal_risk"
    model_card:
      model_parameters:
        architecture_family: "transformer"
        task: "sentiment-analysis"
    intended_purpose:
      description: "Analyze sentiment of customer reviews"
      domain: "customer_service"
 
compliance:
  status: "compliant"

Multiple AI Systems

schema_version: "1.0.0"
generated_at: "2025-01-15T10:30:00Z"
generated_by: "aigovhub v0.1.0"
 
repository:
  name: "ai-platform"
  git_commit: "def456"
 
scan_config:
  confidence_threshold: 0.7
  use_llm_fallback: true
  llm_provider: "anthropic"
 
ai_systems:
  - id: "ai-001"
    name: "document-classifier"
    type: "nlp"
    detection_confidence: 1.0
    source:
      files:
        - "src/document_classifier.py"
      dependencies:
        - "transformers>=4.30.0"
      model_files: []
    classification:
      risk_category: "minimal_risk"
    intended_purpose:
      description: "Classify internal documents by topic"
      domain: "document_management"
 
  - id: "ai-002"
    name: "chatbot-integration"
    type: "llm_integration"
    detection_confidence: 0.95
    source:
      files:
        - "src/chatbot.py"
        - "src/prompts.py"
      dependencies:
        - "anthropic>=0.40.0"
        - "langchain>=0.1.0"
      model_files: []
    classification:
      risk_category: "limited_risk"
    intended_purpose:
      description: "Customer support chatbot"
      domain: "customer_service"
 
  - id: "ai-003"
    name: "face-recognition"
    type: "computer_vision"
    detection_confidence: 0.92
    source:
      files:
        - "src/biometrics/face.py"
      dependencies:
        - "opencv-python>=4.8.0"
        - "face-recognition>=1.3.0"
      model_files:
        - "models/face_encoder.dat"
    classification:
      risk_category: "high_risk"
    intended_purpose:
      description: "Employee access control"
      domain: "biometrics"
 
compliance:
  status: "in_progress"
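
With several systems in one artifact, it is often useful to see which systems share a dependency. A possible sketch follows, assuming the artifact is parsed into a dict; `collect_dependencies` is a hypothetical helper.

```python
def collect_dependencies(artifact: dict) -> dict[str, list[str]]:
    """Map each dependency string to the ids of the AI systems that declare it."""
    usage: dict[str, list[str]] = {}
    for system in artifact.get("ai_systems", []):
        # source.dependencies is optional, so default to an empty list.
        dependencies = (system.get("source") or {}).get("dependencies") or []
        for dependency in dependencies:
            usage.setdefault(dependency, []).append(system["id"])
    return usage
```

Dependency strings are compared verbatim, so `"transformers>=4.30.0"` and `"transformers>=4.31.0"` count as different entries in this sketch.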

Validation

Use the CLI to validate your artifact:

# Basic validation
aigovhub validate aigovhub.yaml
 
# Strict validation (warnings become errors)
aigovhub validate aigovhub.yaml --strict
 
# Validate against specific schema version
aigovhub validate aigovhub.yaml --schema-version 1.0.0
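
To call the validator from a script or CI job, the commands above can be wrapped as follows. This is a sketch under two assumptions that the commands above do not confirm: that `--strict` and `--schema-version` may be combined, and that the CLI exits with code 0 only when the artifact is valid. Both function names are hypothetical.

```python
import subprocess
from typing import Optional


def build_validate_command(
    path: str, strict: bool = False, schema_version: Optional[str] = None
) -> list[str]:
    """Assemble an `aigovhub validate` command line from the documented flags."""
    command = ["aigovhub", "validate", path]
    if strict:
        command.append("--strict")
    if schema_version is not None:
        command += ["--schema-version", schema_version]
    return command


def run_validation(
    path: str, strict: bool = False, schema_version: Optional[str] = None
) -> bool:
    """Run the CLI and report success.

    Assumption: a zero exit code means the artifact is valid.
    """
    result = subprocess.run(build_validate_command(path, strict, schema_version))
    return result.returncode == 0
```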

Migration Guide

From Pre-1.0 to 1.0.0

The 1.0.0 schema is the first stable version, so no migration is needed.

Future Versions

Schema versions follow semantic versioning:

Patch (1.0.x): Backward-compatible clarifications
Minor (1.x.0): Backward-compatible additions
Major (x.0.0): Breaking changes (with a migration guide)
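
A consumer can use these rules to decide whether it is able to read a given artifact. The sketch below encodes one possible policy, which is an assumption of this example: accept artifacts with the same major version and a minor version no newer than the one the reader supports, and ignore the patch level. `parse_version` and `can_read` are hypothetical names.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a "major.minor.patch" string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def can_read(artifact_version: str, supported_version: str = "1.0.0") -> bool:
    """Return True if a reader built for supported_version can read the artifact.

    Assumed policy: same major version, and the artifact's minor version is
    not newer than the reader's. Patch releases never affect compatibility.
    """
    artifact_major, artifact_minor, _ = parse_version(artifact_version)
    supported_major, supported_minor, _ = parse_version(supported_version)
    return artifact_major == supported_major and artifact_minor <= supported_minor
```

A more tolerant reader could also accept newer minor versions and ignore fields it does not recognize, since minor releases only add to the schema.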