About ADA-IG
How this platform is built, what every file does, and how every API call works.
What ADA-IG is
ADA-IG (Accessible Data Archive — Instagram) is a self-hosted, read-only Instagram data ingestion and monitoring platform. It connects to Instagram via the official Meta Graph API for authorized Business or Creator accounts, periodically ingests profile, post, and comment data, tracks field-level changes over time, and exposes everything through a citation-first search interface and a clean operator dashboard.
Every record is traceable back to the raw source artifact and the run that produced it. No data is ever invented, inferred beyond what the API returns, or collected through unsupported means. The platform is intended to give operators a reliable, auditable window into account activity — nothing more.
What it is not
- Not a scraper — data comes from the official Meta Graph API by default
- Not a publisher — no writes back to Instagram (no posts, DMs, comments, moderation)
- Not a social graph expander — no follower harvesting or friends-of-friends collection
- Not an automation tool — the platform monitors; it does not act
Tech stack
Design constraints
Service architecture
All services run in Docker Compose. Each has a single, bounded responsibility.
Data flow
1. A run is started by a manual demo request (POST /api/runs/manual-demo), the inbound OpenClaw webhook (POST /api/hooks/openclaw/run), or the worker bootstrap loop.
2. Raw payloads are written to storage/artifacts/ and recorded as SourceArtifact rows in the database.
3. pipeline.py calls normalizer.py, which maps the raw artifact payload to Profile, Post, and Comment ORM records and creates a new versioned snapshot for each.
4. diff_engine.py compares the new snapshot against the previous one for each entity and writes an EntityDiff record for every changed field (old value, new value, run reference).
5. chunker.py splits text fields (bio, captions, comments) into overlapping chunks. embedder.py converts each chunk to a vector and writes EmbeddingChunk rows for pgvector indexing.
6. Search queries are matched against EmbeddingChunk. Every result carries provenance: run_id, artifact_id, and entity_id.

File structure
API reference
Every endpoint this platform exposes.
Health & readiness (2 endpoints)
- Health: returns {"status":"ok"}. Used by Docker and load balancers to confirm the process is running.
- Readiness: returns {"status":"ok"} or an error with the reason.
Ingestion runs (3 endpoints)
Profiles (1 endpoint)
Posts (1 endpoint)
Comments (1 endpoint)
Changes (diffs) (1 endpoint)
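The diffs served by this group are described under Data flow as EntityDiff records holding the old value, the new value, and a run reference for every changed field. A minimal sketch of that comparison, assuming snapshots are plain dictionaries; FieldChange and diff_snapshots are hypothetical names used for illustration and are not taken from diff_engine.py:

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FieldChange:
    """One changed field: old value, new value, and the run that observed it."""
    entity_id: str
    field: str
    old_value: Any
    new_value: Any
    run_id: str


def diff_snapshots(entity_id, previous, current, run_id):
    """Compare two snapshots of one entity and list every field that changed.

    Assumption: a field missing from a snapshot is treated the same as None.
    """
    changes = []
    for field in sorted(set(previous) | set(current)):
        old, new = previous.get(field), current.get(field)
        if old != new:
            changes.append(FieldChange(entity_id, field, old, new, run_id))
    return changes


# Hypothetical sample snapshots of the same profile from two consecutive runs.
previous = {"bio": "Coffee roaster", "followers_count": 120}
current = {"bio": "Coffee roaster in Leeds", "followers_count": 120}

changes = diff_snapshots("profile-1", previous, current, run_id="run-42")
for change in changes:
    print(f"{change.field}: {change.old_value!r} -> {change.new_value!r} (run {change.run_id})")
```

Unchanged fields produce no record, so a run that observes no changes leaves no diff rows for that entity.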
Search (1 endpoint)
Searches EmbeddingChunk vectors using pgvector cosine similarity. Every result includes run_id, artifact_id, and entity_id for full provenance.
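To illustrate how overlapping chunks, vectors, and cosine similarity fit together, here is a pure-Python sketch. It is not the platform's code: the platform delegates similarity to pgvector inside PostgreSQL, and the behaviour of dev-hash-embedder is not documented here. The names chunk_text, toy_hash_embed, and search, the chunk sizes, and the sample records are all assumptions made for illustration.

```python
import hashlib
import math


def chunk_text(text, size=200, overlap=50):
    """Split text into character windows that overlap by `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def toy_hash_embed(text, dim=256):
    """Hypothetical stand-in embedder: hash each word into one of `dim` buckets."""
    vector = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vector[int.from_bytes(digest[:4], "big") % dim] += 1.0
    return vector


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, index, top_k=3):
    """Rank indexed chunks by similarity; each hit keeps its provenance fields."""
    query_vector = toy_hash_embed(query)
    scored = [
        {**row, "score": cosine_similarity(query_vector, row["vector"])}
        for row in index
    ]
    scored.sort(key=lambda row: row["score"], reverse=True)
    return scored[:top_k]


# Hypothetical sample records; real rows would come from the database.
documents = [
    {"run_id": "run-1", "artifact_id": "art-1", "entity_id": "post-1",
     "text": "Fresh espresso beans roasted this morning"},
    {"run_id": "run-1", "artifact_id": "art-2", "entity_id": "post-2",
     "text": "Closed on Monday"},
]
index = []
for doc in documents:
    for chunk in chunk_text(doc["text"]):
        index.append({**doc, "text": chunk, "vector": toy_hash_embed(chunk)})

for hit in search("espresso beans", index, top_k=1):
    print(hit["entity_id"], hit["run_id"], hit["artifact_id"], round(hit["score"], 3))
```

Because every indexed row carries run_id, artifact_id, and entity_id alongside its vector, each hit can be traced back to its source without a second lookup.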
Replay (1 endpoint)
Automation hooks (1 endpoint)
POST /api/hooks/openclaw/run requires the X-OpenClaw-Secret header to match the configured OPENCLAW_WEBHOOK_SECRET value. Returns the created run ID.
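A sketch of how a shared-secret header check of this kind could look. The source states only that the header must match the configured value; the constant-time comparison, the rejection of an empty secret, and the function name is_authorized are assumptions added here, not a description of the platform's handler.

```python
import hmac


def is_authorized(headers, expected_secret):
    """Return True only if X-OpenClaw-Secret equals the configured secret.

    Assumptions: `headers` is a plain dict keyed exactly "X-OpenClaw-Secret"
    (a real web framework would normalise header case), and an unset or empty
    secret rejects every request instead of accepting an empty header.
    """
    if not expected_secret:
        return False
    supplied = headers.get("X-OpenClaw-Secret", "")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        supplied.encode("utf-8"), expected_secret.encode("utf-8")
    )


# Hypothetical value; in a deployment this would come from OPENCLAW_WEBHOOK_SECRET.
configured_secret = "example-secret"

print(is_authorized({"X-OpenClaw-Secret": "example-secret"}, configured_secret))
print(is_authorized({"X-OpenClaw-Secret": "wrong"}, configured_secret))
print(is_authorized({}, configured_secret))
```

Refusing to authorize when no secret is configured keeps a misconfigured deployment from exposing the hook to unauthenticated callers.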
Settings (2 endpoints)
Writes the submitted settings to .env. Accepts a multipart form body with all platform config fields. On success, redirects to /settings?saved=1.
Operator (7 endpoints)
Returns the connection state (connected, ready_for_auth, or needs_credentials), whether keys are present, the auth state, and the latest run event.
Seed action: redirects to /settings?action=seeded.
Test action: redirects to /settings?action=tested.
Restart action: redirects to /settings?action=restart-needed to display the restart command. The actual restart must be done from the host shell.
Wipe action: touches .env. Requires the exact confirmation phrase in the form body. Redirects to /settings?action=wiped on success or /settings?action=wipe-error on phrase mismatch.
UI pages (4 pages)
Environment variables
| Variable | Purpose | Default |
|---|---|---|
| DATABASE_URL | PostgreSQL DSN for SQLAlchemy | — |
| ARTIFACT_ROOT | Local path where raw JSON artifacts are written | /data/artifacts |
| META_APP_ID | Meta developer app ID for Instagram Graph API | — |
| META_APP_SECRET | Meta developer app secret | — |
| META_REDIRECT_URI | OAuth callback URL registered in the Meta app | — |
| OPENCLAW_WEBHOOK_SECRET | Shared secret for the inbound automation hook | — |
| ENCRYPTION_KEY | Key for future stored token/secret protection | — |
| EMBEDDING_MODEL | Embedder to use for vector search | dev-hash-embedder |
| PUBLIC_CAPTURE_ENABLED | Enables Playwright public capture scaffold | false |
| LOG_LEVEL | Uvicorn and app log verbosity | INFO |
| APP_ENV | Deployment environment label | dev |
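As an illustration of how these variables and their defaults could be read at startup, here is a standard-library sketch. The Settings class, the load_settings function, and the set of strings accepted as true for PUBLIC_CAPTURE_ENABLED are assumptions; the defaults are the ones listed in the table, and only a subset of the variables is shown.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Subset of the variables from the table above (hypothetical container)."""
    database_url: Optional[str]
    artifact_root: str
    embedding_model: str
    public_capture_enabled: bool
    log_level: str
    app_env: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    # Assumption: these spellings count as "true"; anything else is false.
    truthy = {"1", "true", "yes"}
    return Settings(
        database_url=env.get("DATABASE_URL"),  # no default in the table
        artifact_root=env.get("ARTIFACT_ROOT", "/data/artifacts"),
        embedding_model=env.get("EMBEDDING_MODEL", "dev-hash-embedder"),
        public_capture_enabled=(
            env.get("PUBLIC_CAPTURE_ENABLED", "false").strip().lower() in truthy
        ),
        log_level=env.get("LOG_LEVEL", "INFO"),
        app_env=env.get("APP_ENV", "dev"),
    )


# Hypothetical DSN for illustration; pass os.environ (the default) in real use.
settings = load_settings({"DATABASE_URL": "postgresql://user:pass@db:5432/ada"})
print(settings.artifact_root, settings.embedding_model, settings.public_capture_enabled)
```

Accepting any mapping keeps the loader easy to exercise without mutating the process environment.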
Version
| Field | Value |
|---|---|
| Version | V1.1 |
| Scope | Read-only: ingest, monitor, search, diff |
| API lane | Meta Instagram Graph API (official) |
| Auth status | Meta OAuth flow not yet implemented; pending V1.2 |
| Public capture | Scaffold present, disabled by default |
| Search | pgvector cosine similarity, dev-hash-embedder in dev mode |