Detection Guide
This guide explains how AIGovHub detects AI systems in your codebase.
Detection Philosophy
AIGovHub follows these principles:
- Deterministic First: Use reliable, deterministic signals before LLM analysis
- High Recall: Prioritize finding all AI systems (false negatives are worse than false positives)
- Explainable: Every detection includes evidence for human review
- Conservative LLM Use: Only use LLM for genuinely ambiguous cases
Detection Pipeline
┌────────────────────┐
│     Repository     │
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│ Parse Dependencies │ ← requirements.txt, pyproject.toml, setup.py
└─────────┬──────────┘
          │
          ▼
┌─────────────────────────────────────────────────────────────┐
│                      Signal Detectors                       │
│   ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐   │
│   │  Library  │ │   Model   │ │    API    │ │   Code    │   │
│   │  Signal   │ │   File    │ │   Usage   │ │  Pattern  │   │
│   │           │ │  Signal   │ │  Signal   │ │  Signal   │   │
│   └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘   │
│         │             │             │             │         │
│         └──────────┬──┴─────────────┴──┬──────────┘         │
│                    │                   │                    │
│                    ▼                   ▼                    │
│             ┌─────────────┐     ┌─────────────┐             │
│             │ Aggregator  │────▶│  Need LLM?  │             │
│             └─────────────┘     └──────┬──────┘             │
│                                        │                    │
│                              ┌─────────┼─────────┐          │
│                              │   Yes   │   No    │          │
│                              ▼         │         ▼          │
│                        ┌───────────┐   │   ┌───────────┐    │
│                        │    LLM    │   │   │    AI     │    │
│                        │ Analysis  │───┴──▶│  Systems  │    │
│                        └───────────┘       └───────────┘    │
└─────────────────────────────────────────────────────────────┘
Signal Types
1. Library Signal (Priority: 1, Confidence: DEFINITIVE)
Detects AI systems based on Python package dependencies.
What it checks:
- `requirements.txt`
- `pyproject.toml` (`dependencies`)
- `setup.py` (`install_requires`)
Detected Libraries (100+ packages):
| Category | Examples |
|---|---|
| Deep Learning | tensorflow, pytorch, torch, keras, jax |
| Machine Learning | scikit-learn, xgboost, lightgbm, catboost |
| NLP | transformers, spacy, nltk, sentence-transformers |
| Computer Vision | opencv-python, torchvision, detectron2, ultralytics |
| LLM Frameworks | langchain, llama-index, haystack, autogen |
| LLM API Clients | anthropic, openai, cohere, mistralai |
| Vector Databases | chromadb, pinecone, weaviate-client, qdrant-client |
| MLOps | mlflow, wandb, dvc, bentoml |
| Reinforcement Learning | stable-baselines3, gymnasium |
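The dependency check can be sketched roughly as follows. This is a minimal illustration, not AIGovHub's actual implementation: `KNOWN_AI_PACKAGES`, `normalize` and `find_ai_packages` are hypothetical names, the lookup table is a small subset of the categories above, and only `requirements.txt`-style input is handled.

```python
import re

# Illustrative subset of the package table above (assumed mapping, not the full list).
KNOWN_AI_PACKAGES = {
    "torch": "deep_learning",
    "tensorflow": "deep_learning",
    "scikit-learn": "ml_model",
    "transformers": "nlp",
    "langchain": "llm_integration",
    "openai": "llm_integration",
}


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_ai_packages(requirements_text: str) -> dict[str, str]:
    """Return {package: category} for known AI packages in requirements text."""
    found = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name is everything before extras, version specifiers or markers.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            name = normalize(match.group(0))
            if name in KNOWN_AI_PACKAGES:
                found[name] = KNOWN_AI_PACKAGES[name]
    return found
```

Normalizing names before the lookup means `Scikit_Learn` and `scikit-learn` match the same table entry.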
Example Detection:
signal:
  source: library
  confidence: 1.0 (definitive)
  ai_type: nlp
  evidence:
    - "ML library: transformers"
    - "ML library: torch"
  dependencies:
    - transformers
    - torch
2. Model File Signal (Priority: 2, Confidence: HIGH)
Detects AI systems based on model file presence.
Detected File Extensions:
| Extension | Format | Typical Use |
|---|---|---|
| `.pt`, `.pth` | PyTorch | Neural network weights |
| `.h5`, `.keras` | TensorFlow/Keras | Keras models |
| `.onnx` | ONNX | Cross-framework models |
| `.safetensors` | SafeTensors | HuggingFace models |
| `.pkl`, `.joblib` | Pickle/Joblib | scikit-learn models |
| `.pb` | TensorFlow | SavedModel format |
| `.gguf`, `.ggml` | GGML | Local LLM models |
| `.mlmodel` | CoreML | Apple ML models |
| `.engine`, `.trt` | TensorRT | NVIDIA optimized models |
Model Directories:
Files in directories named models/, weights/, checkpoints/, pretrained/ get higher confidence.
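As a rough sketch of this check, the function below maps a file path to a model format and records whether it sits inside one of the model directories. The names `MODEL_EXTENSIONS`, `MODEL_DIRECTORIES` and `classify_model_file` are hypothetical, and the format labels other than `pytorch` are assumptions. How much the confidence rises for files in a model directory is not specified here, so the sketch only reports the flag.

```python
from pathlib import PurePosixPath

# Extension -> format label (labels other than "pytorch" are assumed).
MODEL_EXTENSIONS = {
    ".pt": "pytorch", ".pth": "pytorch",
    ".h5": "keras", ".keras": "keras",
    ".onnx": "onnx",
    ".safetensors": "safetensors",
    ".pkl": "pickle", ".joblib": "joblib",
    ".pb": "tensorflow",
    ".gguf": "ggml", ".ggml": "ggml",
    ".mlmodel": "coreml",
    ".engine": "tensorrt", ".trt": "tensorrt",
}
MODEL_DIRECTORIES = {"models", "weights", "checkpoints", "pretrained"}


def classify_model_file(path: str):
    """Return a small record for a model file, or None for other files."""
    p = PurePosixPath(path)
    fmt = MODEL_EXTENSIONS.get(p.suffix.lower())
    if fmt is None:
        return None
    # True when any parent directory has one of the model directory names.
    in_model_dir = any(part in MODEL_DIRECTORIES for part in p.parent.parts)
    return {"file": path, "format": fmt, "in_model_directory": in_model_dir}
```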
Example Detection:
signal:
  source: model_file
  confidence: 0.9 (high)
  ai_type: deep_learning
  evidence:
    - "Model file: models/classifier.pt"
  files:
    - models/classifier.pt
  metadata:
    model_formats: ["pytorch"]
    in_model_directory: "true"
3. API Usage Signal (Priority: 4, Confidence: MEDIUM-HIGH)
Detects AI API usage patterns in code.
Detected API Endpoints:
| Provider | API Pattern |
|---|---|
| OpenAI | api.openai.com |
| Anthropic | api.anthropic.com |
| Cohere | api.cohere.ai |
| Google | generativelanguage.googleapis.com |
| Mistral | api.mistral.ai |
| Replicate | api.replicate.com |
| Together | api.together.xyz |
| Groq | api.groq.com |
| Perplexity | api.perplexity.ai |
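A simple way to picture this check is a substring search for the hostnames above in each source file. The sketch below assumes plain text matching; `API_HOSTS` and `find_api_hosts` are hypothetical names, and the real detector may use more precise patterns.

```python
# Hostnames from the table above, in table order.
API_HOSTS = [
    "api.openai.com",
    "api.anthropic.com",
    "api.cohere.ai",
    "generativelanguage.googleapis.com",
    "api.mistral.ai",
    "api.replicate.com",
    "api.together.xyz",
    "api.groq.com",
    "api.perplexity.ai",
]


def find_api_hosts(source: str) -> list[str]:
    """Return the known AI API hostnames that appear in the source text."""
    return [host for host in API_HOSTS if host in source]
```

For example, `find_api_hosts('base_url = "https://api.openai.com/v1"')` returns `["api.openai.com"]`.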
Example Detection:
signal:
  source: api_usage
  confidence: 0.85 (high)
  ai_type: llm_integration
  evidence:
    - "API pattern: api.openai.com"
  files:
    - src/chat.py
4. Code Pattern Signal (Priority: 5, Confidence: MEDIUM)
Detects ML patterns in Python code.
Detected Patterns:
| Pattern | Description |
|---|---|
| `model.fit()` | Model training |
| `model.predict()` | Model inference |
| `model.train()` | PyTorch training mode |
| `model.eval()` | PyTorch evaluation mode |
| `.to('cuda')` | GPU transfer |
| `torch.load()` | PyTorch model loading |
| `from_pretrained()` | HuggingFace model loading |
| `AutoModel.` | HuggingFace auto classes |
| `pipeline()` | HuggingFace pipelines |
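These patterns could be matched with regular expressions along the following lines. This is a sketch under assumptions: the regexes are one plausible reading of each table entry (for instance, `AutoModel.` is taken to include classes such as `AutoModelForCausalLM`), and `CODE_PATTERNS` and `find_code_patterns` are hypothetical names.

```python
import re

# Pattern label (as in the table above) -> assumed regular expression.
CODE_PATTERNS = {
    "model.fit()": r"\bmodel\.fit\s*\(",
    "model.predict()": r"\bmodel\.predict\s*\(",
    "model.train()": r"\bmodel\.train\s*\(",
    "model.eval()": r"\bmodel\.eval\s*\(",
    ".to('cuda')": r"\.to\(\s*['\"]cuda",
    "torch.load()": r"\btorch\.load\s*\(",
    "from_pretrained()": r"\bfrom_pretrained\s*\(",
    "AutoModel.": r"\bAutoModel\w*\.",
    "pipeline()": r"\bpipeline\s*\(",
}


def find_code_patterns(source: str) -> list[str]:
    """Return the labels of all patterns found in the source, in table order."""
    return [label for label, rx in CODE_PATTERNS.items() if re.search(rx, source)]
```

Text matching like this is cheap but imprecise (any variable named `model` with a `fit` method will match), which is consistent with this signal carrying only medium confidence.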
Example Detection:
signal:
  source: code_pattern
  confidence: 0.7 (medium)
  ai_type: ml_model
  evidence:
    - "Code pattern: model.fit()"
    - "Code pattern: from_pretrained()"
  files:
    - src/train.py
5. LLM Analysis Signal (Priority: 6, Confidence: VARIABLE)
Uses LLM to analyze ambiguous code when deterministic signals are insufficient.
When LLM is triggered:
- Some signals exist but confidence < threshold
- Only code patterns found (no library/model signals)
- Confidence is in the "maybe" zone (between 0.3 and the threshold)
Not triggered when:
- Definitive signals found (library detection)
- Confidence already above threshold
- `--no-llm` flag used
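The trigger rules above can be read as a small decision function. The sketch below is one interpretation, not the actual implementation: it assumes the "not triggered" conditions take precedence over the triggers, and that any remaining case with at least one signal below the threshold goes to the LLM. The name `should_run_llm` and the source labels are hypothetical.

```python
def should_run_llm(sources: set[str], confidence: float,
                   threshold: float, no_llm: bool = False) -> bool:
    """Decide whether ambiguous signals should be passed to LLM analysis."""
    if no_llm:
        return False  # --no-llm flag used
    if "library" in sources:
        return False  # definitive signal found
    if confidence >= threshold:
        return False  # already confident enough
    # Assumption: any signal below the threshold (including the
    # "only code patterns" case) is treated as ambiguous.
    return bool(sources)
```

Under this reading, a repository with only code patterns at confidence 0.42 and a threshold of 0.5 would be sent to the LLM, while one with a library signal never would.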
Example Detection:
signal:
  source: llm_analysis
  confidence: 0.75 (medium)
  ai_type: ml_model
  evidence:
    - "Custom neural network implementation"
    - "Manual backpropagation code"
  metadata:
    reasoning: "Code implements gradient descent manually without ML libraries"
    llm_provider: "anthropic"
Signal Aggregation
Multiple signals are combined using weighted aggregation:
Weights
| Signal Source | Weight |
|---|---|
| Library | 1.0 |
| Model File | 0.9 |
| API Usage | 0.8 |
| LLM Analysis | 0.7 |
| Code Pattern | 0.6 |
Aggregation Logic
- Definitive signal present? → Confidence = 100%
- Multiple signals? → Weighted average + bonus (up to +20% for multiple agreeing signals)
- Below threshold? → Not reported as AI system
Example Aggregation
Signals:
- Library (transformers): 1.0 × 1.0 = 1.0
- Model File (.pt): 0.9 × 0.9 = 0.81
- Code Pattern: 0.7 × 0.6 = 0.42
Weighted Average: (1.0 + 0.81 + 0.42) / (1.0 + 0.9 + 0.6) = 0.89
Bonus for 3 signals: +0.1
Final Confidence: 0.99 → HIGH
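The arithmetic in this example can be reproduced with a short sketch. Several details are assumptions: the bonus is modeled as 0.05 per additional signal, capped at 0.20 (which matches "+0.1 for 3 signals" and "up to +20%"), whether signals "agree" is not modeled, the library signal is treated as the only definitive source, and the function names are hypothetical.

```python
# Weights from the table above.
WEIGHTS = {
    "library": 1.0,
    "model_file": 0.9,
    "api_usage": 0.8,
    "llm_analysis": 0.7,
    "code_pattern": 0.6,
}


def weighted_confidence(signals: list[tuple[str, float]]) -> float:
    """Weighted average of (source, confidence) pairs plus a multi-signal bonus."""
    total_weight = sum(WEIGHTS[source] for source, _ in signals)
    average = sum(conf * WEIGHTS[source] for source, conf in signals) / total_weight
    bonus = min(0.05 * (len(signals) - 1), 0.20)  # assumed bonus schedule
    return min(average + bonus, 1.0)


def aggregate(signals: list[tuple[str, float]], threshold: float):
    """Apply the three aggregation rules; return a score or None if not reported."""
    if not signals:
        return None
    if any(source == "library" for source, _ in signals):
        score = 1.0  # definitive signal present
    else:
        score = weighted_confidence(signals)
    return score if score >= threshold else None
```

With the three signals from the example, `weighted_confidence` gives roughly 0.99, matching the worked arithmetic. Note that `aggregate` would return 1.0 for the same input, because the library signal triggers the definitive rule before any averaging happens.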
AI System Type Classification
When aggregating signals, the system infers the AI type:
| Priority | Category | Resulting Type |
|---|---|---|
| 1 | GGUF/GGML files | llm_integration |
| 2 | LLM framework packages | llm_integration |
| 3 | Deep learning libraries | deep_learning |
| 4 | NLP libraries | nlp |
| 5 | Computer vision libraries | computer_vision |
| 6 | RL libraries | reinforcement_learning |
| 7 | General ML libraries | ml_model |
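One way to read this table is as an ordered lookup in which the first matching category wins. The sketch below assumes that reading; the category keys and the name `infer_ai_type` are hypothetical labels chosen for illustration.

```python
# (category, resulting type) pairs in priority order, per the table above.
TYPE_PRIORITY = [
    ("gguf_ggml_file", "llm_integration"),
    ("llm_framework", "llm_integration"),
    ("deep_learning", "deep_learning"),
    ("nlp", "nlp"),
    ("computer_vision", "computer_vision"),
    ("reinforcement_learning", "reinforcement_learning"),
    ("general_ml", "ml_model"),
]


def infer_ai_type(categories: set[str]):
    """Return the type of the highest-priority category present, else None."""
    for category, ai_type in TYPE_PRIORITY:
        if category in categories:
            return ai_type
    return None
```

For instance, a repository with both an LLM framework package and a general ML library would be classified as `llm_integration`.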
Understanding Detection Results
Confidence Levels
Confidence levels are automatically computed from the numeric score:
| Level | Score Range | Interpretation |
|---|---|---|
| DEFINITIVE | 100% | Confirmed AI system - ML library dependency found |
| HIGH | 80-99% | Very likely AI system - strong evidence |
| MEDIUM | 50-79% | Probable AI system - review recommended |
| LOW | 30-49% | Possible AI system - manual review needed |
| UNCERTAIN | <30% | Unlikely - not reported unless threshold lowered |
When signals are aggregated, the final aggregated score determines the reported level.
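The mapping from score to level can be sketched as a chain of threshold checks. The function name `confidence_level` is hypothetical, and the handling of scores between 0.99 and 1.0 (treated here as HIGH) is an assumption, since the table gives ranges in whole percentages.

```python
def confidence_level(score: float) -> str:
    """Map a numeric confidence score (0.0 to 1.0) to a level from the table."""
    if score >= 1.0:
        return "DEFINITIVE"
    if score >= 0.8:
        return "HIGH"
    if score >= 0.5:
        return "MEDIUM"
    if score >= 0.3:
        return "LOW"
    return "UNCERTAIN"
```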
Reading the Output
ai_systems:
  - id: "ai-001"
    name: "transformers-nlp" # Auto-generated from evidence
    type: "nlp" # Inferred from signals
    detection_confidence: 0.95 # Aggregated score
    source:
      files: # Files where AI detected
        - "src/model.py"
      dependencies: # AI-related dependencies
        - "transformers>=4.30.0"
      model_files: # Model files found
        - "models/bert.safetensors"
Tuning Detection
Lower False Negatives (Find More)
aigovhub scan . --confidence 0.5 --llm
Lower False Positives (More Certain)
aigovhub scan . --confidence 0.9 --no-llm
Fast Scanning (CI/CD)
aigovhub scan . --no-llm
Thorough Analysis
aigovhub scan . --confidence 0.5 --llm --verbose
Limitations
- Language Support: Currently Python-focused
- Dynamic Analysis: No runtime analysis, only static code inspection
- Proprietary Code: May miss AI systems in compiled dependencies
- Novel Patterns: New AI frameworks may not be recognized immediately
- False Positives: Libraries with ML-like names may be flagged
Extending Detection
See API Reference for creating custom signal detectors.