Trust & Due Diligence
For the people doing the homework.
A consolidated reference for government CIO, CISO, records, and procurement teams conducting technical due diligence on Metadata Minder. Every section maps to a question we have already been asked by an evaluating agency.
01 · Architecture
A bounded discovery pipeline. No black boxes.
Metadata Minder is a containerized pipeline composed of well-known, auditable open-source components — orchestrated by Rietta's hardened glue code. Every stage is replaceable, inspectable, and runs inside your perimeter.
Ingest
Filesystem mount, S3-compatible bucket, or SMB
Normalize
LibreOffice · pdftotext · Tesseract OCR
Extract
exiftool · OOXML parsers · custom WPD handlers
Inference
Ollama (local LLM) — pluggable model layer
Index & store
Postgres + object storage (host-controlled)
Surface
Web dashboard with RBAC and audit log
We can provide an architecture diagram and component inventory under MNDA on request.
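As an illustration of what "every stage is replaceable" can mean in practice, here is a minimal Python sketch of a staged pipeline. It is not Metadata Minder's actual code: the `Document` type, the stage functions, and `run_pipeline` are hypothetical names, and each stage body is a trivial stand-in for the real tools listed above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical record handed from one stage to the next."""
    source: str
    text: str = ""
    metadata: Dict[str, object] = field(default_factory=dict)
    findings: List[str] = field(default_factory=list)


Stage = Callable[[Document], Document]


def normalize(doc: Document) -> Document:
    # Stand-in for LibreOffice / pdftotext / Tesseract: only trims whitespace.
    doc.text = doc.text.strip()
    return doc


def extract(doc: Document) -> Document:
    # Stand-in for exiftool and the OOXML parsers: records the text length.
    doc.metadata["length"] = len(doc.text)
    return doc


def infer(doc: Document) -> Document:
    # Stand-in for the local model layer: flags documents with no author field.
    if "author" not in doc.metadata:
        doc.findings.append("missing-author-metadata")
    return doc


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    # Stages run in order; replacing one only means changing this list.
    for stage in stages:
        doc = stage(doc)
    return doc


result = run_pipeline(
    Document(source="memo.docx", text="  Budget memo  "),
    [normalize, extract, infer],
)
```

Because each stage shares one signature, an evaluator could swap a single function (say, a different OCR wrapper) and rerun the same list without touching the others.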
02 · Deployment Topologies
Three supported postures.
Posture A
Fully on-premises via Docker
The entire pipeline runs in containers on your hardware. No SaaS dependency, no callback URLs. Air-gapped deployments with no upstream telemetry are supported.
Posture B
AWS GovCloud (US)
For agencies standardized on AWS GovCloud, Metadata Minder deploys natively into your existing FedRAMP-aligned environment. We do not introduce additional public-internet egress.
Posture C
Rietta-Managed Lab (Pilot Phase)
Managed lab environment operated by Rietta. Ideal for pilot-phase evaluation without requiring on-prem infrastructure.
03 · Data Handling & Residency
Document content is never sent to external AI providers.
- No external data transfer by default. The deployment ships with outbound network access to public AI providers disabled.
- Local LLM inference via Ollama. Document content is never sent to OpenAI, Anthropic, Google, or any other third-party LLM provider.
- Encryption. Transport is TLS 1.2+. At-rest encryption is provided by the host filesystem or AWS GovCloud KMS — keys stay with you.
- Data residency. All processing occurs on agency-controlled infrastructure. No cross-border replication.
- Retention & disposal. Configurable per jurisdiction. Findings reports and source artifacts can be purged on a defined schedule with cryptographic deletion confirmation.
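One way a reviewer might sanity-check the "local inference only" claim is a configuration guard that refuses any inference endpoint outside loopback or private address space. The sketch below is an assumption about how such a guard could look, not a description of the shipped product; `is_local_endpoint` is a hypothetical helper. The port 11434 in the usage lines is Ollama's default listening port.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(url: str) -> bool:
    """Return True only for loopback or private-range inference endpoints.

    Hypothetical guard: hostnames other than "localhost" are rejected
    outright rather than resolved, so DNS cannot redirect traffic.
    """
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (e.g. a public API hostname): refuse.
        return False
    return addr.is_loopback or addr.is_private


local_ok = is_local_endpoint("http://127.0.0.1:11434")
public_ok = is_local_endpoint("https://api.example.com/v1")
```

A guard like this complements, and does not replace, network-level egress controls on the host.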
04 · Identity & Access
Role-based access, scoped to your directory.
Pilot Phase 2
During the pilot phase, access is managed directly by Rietta.
Authentication
SSO via SAML 2.0 / OIDC
Integrates with Microsoft Entra ID, Okta, and Active Directory Federation Services. MFA enforcement is delegated to your identity provider. Planned for Pilot Phase 2.
Authorization
Role-based, least-privilege
Distinct roles for Records Officer, ADA Coordinator, General Counsel, and IT Administrator. Per-archive scoping is enforced at the data layer. Planned for Pilot Phase 2.
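To make "role-based, least-privilege" with per-archive scoping concrete, here is a small Python sketch. The four roles come from the text above; the permission names, the `Grant` type, and `is_allowed` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Role(Enum):
    RECORDS_OFFICER = "records_officer"
    ADA_COORDINATOR = "ada_coordinator"
    GENERAL_COUNSEL = "general_counsel"
    IT_ADMINISTRATOR = "it_administrator"


# Hypothetical permission sets; a real deployment would define its own.
PERMISSIONS = {
    Role.RECORDS_OFFICER: {"view_findings", "export_report"},
    Role.ADA_COORDINATOR: {"view_findings"},
    Role.GENERAL_COUNSEL: {"view_findings", "export_report"},
    Role.IT_ADMINISTRATOR: {"manage_users", "view_audit_log"},
}


@dataclass(frozen=True)
class Grant:
    """A role held by one user, limited to specific archives."""
    role: Role
    archives: FrozenSet[str]


def is_allowed(grant: Grant, action: str, archive: str) -> bool:
    # Deny unless BOTH the archive is in scope and the role permits the action.
    return archive in grant.archives and action in PERMISSIONS[grant.role]


officer = Grant(Role.RECORDS_OFFICER, frozenset({"clerk-archive"}))
```

The default-deny shape is the point of the sketch: a request passes only when both the archive scope and the role permission match.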
05 · Software Supply Chain
Every dependency tracked, signed, and scanned.
- SBOM on request in CycloneDX format for each release.
- Signed container images. Releases are published with verifiable signatures so your IT team can validate provenance before promotion.
- Continuous dependency scanning. Third-party packages are tracked against published CVE feeds; patches are scheduled to outrun disclosed vulnerabilities, not chase them.
- Reproducible builds. The same containerized build process runs on developer machines, in CI, and in production.
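For teams that plan to consume the SBOM themselves, the following Python sketch shows one way to cross-check a CycloneDX JSON document against a list of affected versions. It assumes the standard CycloneDX JSON layout (a top-level `bomFormat` and a `components` array with `name` and `version`); the function name, the sample component, and the advisory data are hypothetical.

```python
import json
from typing import Dict, List, Set


def vulnerable_components(sbom_json: str,
                          advisories: Dict[str, Set[str]]) -> List[str]:
    """List "name@version" for components that appear in the advisories.

    `advisories` maps a package name to its affected versions
    (hypothetical format for this sketch, not a real CVE feed schema).
    """
    bom = json.loads(sbom_json)
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX document")
    hits = []
    for component in bom.get("components", []):
        name = component.get("name")
        version = component.get("version")
        if version in advisories.get(name, set()):
            hits.append(f"{name}@{version}")
    return hits


# Invented sample data, for illustration only.
sample_sbom = json.dumps({
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {"type": "library", "name": "libexample", "version": "1.0.0"},
        {"type": "library", "name": "otherlib", "version": "2.1.0"},
    ],
})
flagged = vulnerable_components(sample_sbom, {"libexample": {"1.0.0"}})
```

Exact-version matching keeps the sketch short; real advisories usually express version ranges, which would need a proper version comparison.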
06 · Secure SDLC
The same discipline that ships HIPAA-covered SaaS.
Metadata Minder inherits Rietta's appsec practice in full: mandatory security-oriented code review on every change, branch-protected merges, automated test gates, and a Continuous Blue Team relationship between engineering and security review.
Process
Containerized CI/CD
Reproducibility is the foundation of patchability. Every build is rebuildable from source for at least the support window of the release.
Practice
Continuous Blue Team
A standing pairing with internal security review, not a one-time pen test. Findings are fed directly back into the pipeline.
07 · Vulnerability Management
4-hour CVE patch cycle vs. a 43-day industry median (Verizon 2026 DBIR).
Rietta has spent over two decades patching production systems within hours of disclosed vulnerabilities. We extend the same practice to Metadata Minder deployments under the response commitments listed below, with optional managed-update channels for agencies that want patches applied on our cadence rather than theirs.
- Critical CVE response: patch released within 72 hours of public disclosure when exploitable.
- High CVE response: patch released within 14 days.
- Coordinated disclosure: [email protected] is monitored. PGP-encrypted channel available on request.
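The response windows above translate directly into dates a contract manager can track. This Python sketch computes the patch deadline for a disclosure timestamp; the 72-hour and 14-day figures are taken from the list above, while `SLA_WINDOWS` and `patch_deadline` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# Windows from the commitments above. The critical window applies
# when the vulnerability is exploitable.
SLA_WINDOWS = {
    "critical": timedelta(hours=72),
    "high": timedelta(days=14),
}


def patch_deadline(severity: str, disclosed_at: datetime) -> datetime:
    """Return the latest patch release time for a given disclosure."""
    try:
        window = SLA_WINDOWS[severity.lower()]
    except KeyError:
        raise ValueError(f"no stated SLA for severity {severity!r}")
    return disclosed_at + window


disclosed = datetime(2026, 3, 2, 9, 0, tzinfo=timezone.utc)
critical_due = patch_deadline("critical", disclosed)
high_due = patch_deadline("High", disclosed)
```

Using timezone-aware timestamps avoids ambiguity when the disclosure and the deploying agency sit in different time zones.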
08 · Logging, Monitoring & Audit
Every action is attributable. Every finding is reproducible.
- Append-only audit log of authentication, document access, finding generation, and report export — exportable to your SIEM (Splunk, Elastic, Sentinel) via syslog or JSON.
- Reproducible findings. Each report is keyed to a discovery-pass ID; rerunning the pass against the same archive snapshot reproduces the same findings.
- Health and performance metrics exposed via Prometheus-compatible endpoints, scoped to your monitoring stack.
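Hash chaining is one common technique for making an append-only log tamper-evident. The sketch below shows it in Python with a JSON Lines export of the kind a SIEM can ingest. Whether Metadata Minder uses hash chaining is not stated here, so treat this as an assumption; the `AuditLog` class and its field names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: Dict[str, str]) -> str:
    # Canonical JSON (sorted keys) so the same body always hashes the same.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self._entries: List[Dict[str, str]] = []

    def append(self, actor: str, action: str, target: str) -> Dict[str, str]:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        body = {"actor": actor, "action": action, "target": target, "prev": prev}
        entry = {**body, "hash": _digest(body)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        # Recompute every hash and check each entry links to its predecessor.
        prev = GENESIS
        for entry in self._entries:
            body = {k: entry[k] for k in ("actor", "action", "target", "prev")}
            if entry["prev"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True

    def export_jsonl(self) -> str:
        # One JSON object per line.
        return "\n".join(json.dumps(e, sort_keys=True) for e in self._entries)


log = AuditLog()
log.append("jdoe", "document_access", "archive-1/memo.docx")
log.append("jdoe", "report_export", "pass-0001")
```

Editing any earlier entry changes its hash and breaks every later link, so `verify()` detects the alteration.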
09 · Incident Response
A documented runbook, not a war-room scramble.
Severity classification, communication SLAs, and customer-side containment steps are documented in the Operations Manual provided with each deployment. Rietta engineers are reachable directly — not through a tier-1 ticket queue — for any production-impacting incident.
10 · Compliance Posture
Aligned to the standards your auditor is going to ask about.
ADA Title II
Direct subject matter — discovery output is structured for remediation prioritization.
WCAG 2.1 AA
Accessibility analysis aligned to AA success criteria.
PDF/UA
Tag, structure, and reading-order analysis.
FOIA / State PRA
Hidden-metadata exposure mapping for pre-release review.
AWS GovCloud (US)
Native deployment posture; inherits the FedRAMP High control environment of the host.
NIST SP 800-53
Control mapping available on request for AC, AU, CM, IA, RA, SC, SI families.
HIPAA experience
Rietta operates as a Business Associate for HIPAA-covered SaaS — same engineering bar applies here.
CJIS-aware
Architecture supports CJIS Security Policy alignment for jurisdictions that require it.
Metadata Minder itself is not currently FedRAMP-authorized. Deployment inside your existing FedRAMP boundary (such as AWS GovCloud) inherits that boundary's controls.
11 · Crawler Ethics
MetaDataCrawler/1.0 — disclosed, scoped, polite.
Our crawler honors robots.txt, restricts its scope to authorized URIs, uses HEAD requests for change detection, and limits re-examination to roughly once per calendar month with non-predictable timing. Full details are published on the dedicated crawler page.
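The two pre-fetch checks described above (authorized scope, then robots.txt) can be sketched with Python's standard `urllib.robotparser`. This is an illustrative sketch, not the crawler's source: `may_fetch`, the host allowlist, and the example hostnames are hypothetical. The user-agent token is taken from the crawler name given above.

```python
from typing import Set
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

USER_AGENT = "MetaDataCrawler"


def may_fetch(robots_txt: str, url: str, authorized_hosts: Set[str]) -> bool:
    """Allow a URL only if it is in scope AND robots.txt permits it."""
    # Scope check first: never touch hosts outside the authorized set.
    if urlparse(url).hostname not in authorized_hosts:
        return False
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(USER_AGENT, url)


# Invented robots.txt and hostnames, for illustration only.
robots = "User-agent: *\nDisallow: /private/\n"
scope = {"agency.example.gov"}

public_doc = may_fetch(robots, "https://agency.example.gov/docs/a.pdf", scope)
private_doc = may_fetch(robots, "https://agency.example.gov/private/b.pdf", scope)
out_of_scope = may_fetch(robots, "https://other.example.org/docs/a.pdf", scope)
```

The sketch takes the robots.txt body as a string so the logic can be reviewed offline; fetching that file and issuing the HEAD request are left out.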
12 · People & Organization
Who is actually behind the keyboard.
Parent organization
Rietta, Inc.
Founded 1999 in Alpharetta, Georgia. Security-first custom software firm. 27+ years operating on the public internet without losing a customer to a breach.
Engineering leadership
Frank Rietta — CEO
Master of Science in Information Security, Georgia Institute of Technology. Court-recognized expert witness in software. OWASP lifetime member.
References from existing public-sector and HIPAA-covered customers available under MNDA during procurement.
Next step
We expect tough questions. We've answered them before.
Bring your CIO, CISO, records officer, and counsel. We'll walk through the architecture, the threat model, and the deployment options on a single call.