Module 16: Governance, ADRs, & The Architecture Review Board (ARB)
PREREQUISITE STATEMENT: Read this module after completing Module 15 (Environmental Assessment). Selecting the technical patterns for Greenfield or Brownfield migrations is an engineering task; driving team alignment, managing cloud budgets, and documenting decisions for future engineers is a governance task. This module teaches you how to establish architectural standards within an enterprise organization.
1. Introduction: What is Architectural Governance?
In a small startup, system alignment is simple. A few engineers can agree on databases and libraries over a lunch conversation. In a large enterprise with hundreds of developers distributed across multiple autonomous teams, this informal alignment breaks down. Without structured governance, the system decays:
- Technology Sprawl: Team A uses Node.js/Postgres, Team B uses Java/Oracle, Team C uses Python/MongoDB, and Team D uses Rust/DynamoDB. The organization loses the ability to share code, move engineers between teams, or negotiate bulk licensing agreements.
- Compliance Breaches: Individual developers might inadvertently write data to non-compliant storage regions, violating laws like GDPR, HIPAA, or PCI-DSS.
- Technical Debt Accumulation: Short-term product deadlines lead to hacks that compromise core structural patterns (e.g., direct cross-database queries), creating tight coupling that blocks release cycles.
- FinOps / Cloud Cost Sprawl: Without cost boundaries, teams deploy expensive, over-provisioned cluster configurations, causing infrastructure billing to escalate.
Architectural governance is the process of aligning software engineering choices with corporate strategies, security policies, cost constraints, and long-term tech stack standards.
2. Architectural Decision Records (ADRs)
A major challenge in software engineering is understanding why a decision was made. Years after a system is built, a new engineer might look at an unusual routing structure or database configuration and assume it was a mistake, only to break the system by refactoring it because they did not understand the original constraints.
To prevent this, architects use Architectural Decision Records (ADRs), a concept popularized by Michael Nygard.
[ The Git-Based ADR Workflow ]
[ Code Change ] + [ ADR Markdown File in /docs/adr/ ]
|
[ Git Pull Request ] ---> [ Code Review & Governance Sign-off ]
|
[ Merge to Main ] ---> [ Permanent Architectural Ledger ]
- Locality of Documentation: ADRs are not stored in external wikis or intranets (which quickly become out-of-date and are disconnected from code). Instead, ADRs are stored directly in the source control repository (typically under /docs/adr/) as lightweight Markdown files.
- The PR Lifecycle: Every major architectural change requires a corresponding ADR file. The ADR is committed in the same Git branch as the code changes, allowing the team to review the architectural rationale during the Pull Request code review process.
- The Permanent Ledger: Once merged to main, the directory serves as a chronological, searchable ledger documenting the evolution of the system architecture.
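One way to make the PR review step enforceable is a small check that fails when an ADR file is missing a required section. The sketch below is illustrative only: the function name `missing_adr_sections` is hypothetical, and the four required headings are an assumption taken from the ADR example used in this module, not a mandated standard.

```python
import re

# Assumption: every ADR uses these four '## ' headings, as in the module's example.
REQUIRED_SECTIONS = ("Status", "Context", "Decision", "Consequences")


def missing_adr_sections(markdown_text: str) -> list[str]:
    """Return the required section names that have no '## <name>' heading."""
    found = {
        match.group(1)
        for match in re.finditer(r"^##[ \t]+(.+?)[ \t]*$", markdown_text, re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name not in found]


if __name__ == "__main__":
    # A hypothetical draft that has not yet recorded a decision or its consequences.
    draft = "# ADR-007: Example\n## Status\nProposed\n## Context\nWhy we need it.\n"
    print(missing_adr_sections(draft))
```

A team could run a check like this in CI against every file under /docs/adr/ and block the merge while the returned list is non-empty.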
Production-Grade ADR Example: ADR-004-Event-Driven-Checkout.md
Below is an enterprise-grade ADR documenting the decision to migrate from synchronous HTTP to asynchronous event streaming for a checkout system:
# ADR-004: Asynchronous Checkout Event Streaming
## Status
Accepted
## Context
Our current e-commerce checkout flow uses synchronous HTTP POST calls between the Order Service, the Inventory Service, and the Billing Service.
Under peak traffic events (such as marketing campaigns), the Billing Service experiences thread saturation, causing checkout requests to block and time out.
This temporal coupling caps checkout availability at the product of the three services' availabilities, and the blocked requests violate our target P99 latency SLA of < 200ms.
We considered three alternatives:
1. **Scaling up the Billing Service instances (Vertical/Horizontal):** High compute cost, and does not solve the availability coupling problem if the billing database experiences lock contention.
2. **Implementing HTTP retry loops with jitter:** Increases client wait times, violating our latency SLA.
3. **Migrating to Asynchronous Event Streaming (Chosen):** Decouples services temporally.
## Decision
We will replace the synchronous HTTP calls from the Order Service to the Billing and Inventory Services with an asynchronous event-driven model using Apache Kafka.
* The Order Service will commit checkout details to its local database and immediately publish an `OrderPlaced` event to a partitioned Kafka topic (`orders.v1`).
* The Billing and Inventory services will run as independent consumers subscribing to the `orders.v1` topic, processing events asynchronously.
* We will implement the Transactional Outbox pattern in the Order Service to guarantee at-least-once delivery without dual-write failure states.
## Consequences
* **Positive:** The checkout API response time will decrease from > 800ms to < 50ms, as it only requires a local write and database transaction commit.
* **Positive:** Downstream outages in the Billing Service will no longer block checkout requests; events will buffer in Kafka until the billing nodes recover.
* **Negative:** Introduces eventual consistency. The client UI will receive an immediate checkout success status, but inventory reservation and payment processing will complete asynchronously. The frontend must handle payment failure notifications via WebSockets or email alerts.
* **Negative:** Increased infrastructure complexity; requires maintaining a Kafka cluster and monitoring consumer group offsets.
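The Transactional Outbox pattern named in the Decision section can be sketched as follows. This is a minimal illustration, not the ADR's actual implementation: it uses an in-memory SQLite database in place of the Order Service's database, a plain callback in place of a Kafka producer, and hypothetical table and function names (`orders`, `outbox`, `place_order`, `relay_once`).

```python
import json
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the hypothetical orders table and its outbox table."""
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "topic TEXT NOT NULL, payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()


def place_order(conn: sqlite3.Connection, order_id: int, total_cents: int) -> None:
    """Write the order and its event in ONE local transaction (no dual write)."""
    payload = json.dumps({"type": "OrderPlaced", "order_id": order_id, "total_cents": total_cents})
    with conn:  # commits on success, rolls back if either insert raises
        conn.execute("INSERT INTO orders (id, total_cents) VALUES (?, ?)", (order_id, total_cents))
        conn.execute("INSERT INTO outbox (topic, payload) VALUES (?, ?)", ("orders.v1", payload))


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish pending outbox rows in order; return how many were published.

    A row is marked published only after publish() returns, so a crash between
    the two steps re-sends the event later (at-least-once delivery).
    """
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    count = 0
    for row_id, topic, payload in rows:
        publish(topic, payload)  # if this raises, the row stays pending
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
        count += 1
    return count


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    place_order(conn, 1, 4999)
    relay_once(conn, lambda topic, payload: print(topic, payload))
```

Because an event can be sent more than once under this scheme, the Billing and Inventory consumers would need to process `OrderPlaced` idempotently, for example by deduplicating on the order ID.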
3. The Architecture Review Board (ARB) & FinOps
A. The ARB Process
The Architecture Review Board (ARB) is a steering committee composed of senior architects, security officers, and engineering managers:
- The Goal: The ARB reviews proposed ADRs for major system alterations (e.g., adding a new database engine, altering authentication protocols, migrating to a new cloud provider).
- The Presentation: The proposing engineer submits the ADR to the board. The board evaluates the design against corporate standards, security policies, and total cost of ownership (TCO) constraints.
- The Resolution: The ARB either accepts the ADR, rejects it with specific feedback, or requests modifications, ensuring all system changes align with global engineering standards.
B. FinOps: Cost as an Architectural Metric
A modern architect must treat cloud infrastructure costs as a primary engineering constraint:
- Compute Costs: Choosing between Serverless (AWS Lambda, Google Cloud Run) and Container Orchestration (Kubernetes). Serverless is cheap for low or bursty workloads because you pay only for the requests you serve, but it becomes expensive for constant, high-throughput systems, where dedicated container nodes are more cost-effective.
- Storage Egress: Designing systems to avoid transferring raw datasets across geographic regions, which generates high network egress bills.
- The FinOps Ledger: Every architecture proposal must include a cost estimation showing the monthly infrastructure bill based on projected user scale.
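The serverless versus container trade-off above can be turned into a rough break-even estimate for a proposal's cost section. The sketch below is a simplified model under stated assumptions: all prices are made-up placeholder values rather than any provider's published rates, the function names are hypothetical, and real bills include items this model ignores (egress, storage, free tiers).

```python
def serverless_monthly_cost(
    requests_per_month: float,
    avg_duration_s: float,
    memory_gb: float,
    price_per_gb_second: float,
    price_per_million_requests: float,
) -> float:
    """Pay-per-use model: compute time (GB-seconds) plus a per-request fee."""
    compute = requests_per_month * avg_duration_s * memory_gb * price_per_gb_second
    request_fees = requests_per_month / 1_000_000 * price_per_million_requests
    return compute + request_fees


def container_monthly_cost(
    node_count: int, price_per_node_hour: float, hours_per_month: float = 730
) -> float:
    """Fixed-capacity model: nodes are billed whether or not they are busy."""
    return node_count * price_per_node_hour * hours_per_month


def break_even_requests(
    container_cost: float,
    avg_duration_s: float,
    memory_gb: float,
    price_per_gb_second: float,
    price_per_million_requests: float,
) -> float:
    """Monthly request volume at which both models cost the same."""
    cost_per_request = (
        avg_duration_s * memory_gb * price_per_gb_second
        + price_per_million_requests / 1_000_000
    )
    return container_cost / cost_per_request


if __name__ == "__main__":
    # Placeholder prices chosen for illustration only.
    prices = dict(price_per_gb_second=0.00002, price_per_million_requests=0.20)
    fixed = container_monthly_cost(node_count=2, price_per_node_hour=0.10)
    threshold = break_even_requests(fixed, avg_duration_s=0.2, memory_gb=0.5, **prices)
    print(f"containers: ${fixed:.2f}/month, break-even near {threshold:,.0f} requests/month")
```

Below the break-even volume the pay-per-use model is cheaper; above it, the fixed nodes win, provided they can actually absorb the load.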
4. Documentation Standard: Enterprise ARB Agenda
Below is a template for documenting an ARB Review Agenda & Architecture Decisions Log:
ARB Review Ledger
| ADR ID | Proposal Title | Submitting Team | Sponsoring Architect | ARB Status | Long-term Consequences | FinOps Cost Impact |
|---|---|---|---|---|---|---|
| ADR-004 | Asynchronous Checkout Event Streaming | Checkout Platform Team | Jane Doe (Principal Architect) | Accepted | Decouples checkout latency; introduces eventual consistency complexity. | + $150/month (Kafka Cluster resources) |
| ADR-005 | Adopt DynamoDB for User Session Cache | Identity Security Team | John Smith (Lead Architect) | Accepted | Enables fast session retrieval; requires strict TTL configuration. | - $80/month (Replaces over-provisioned ElastiCache nodes) |
| ADR-006 | Shared Database for Inventory and Orders | Inventory Team | Bob Johnson (Senior Engineer) | Rejected | Violates database-per-service microservice boundaries; introduces tight schema coupling. | N/A |
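If the ledger is kept in a machine-readable form, the net budget effect of the accepted proposals can be computed rather than tallied by hand. The sketch below is one possible approach: the `LedgerEntry` type and function names are hypothetical, and it assumes the cost column always follows the "+ $N/month" or "- $N/month" shape used in the table above, treating anything else (such as "N/A") as zero.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    adr_id: str
    status: str
    cost_impact: str  # raw text from the "FinOps Cost Impact" column


def monthly_delta(cost_impact: str) -> float:
    """Parse '+ $150/month (...)' or '- $80/month (...)'; anything else counts as 0."""
    match = re.match(r"\s*([+-])\s*\$(\d+(?:\.\d+)?)/month", cost_impact)
    if match is None:
        return 0.0
    sign = 1.0 if match.group(1) == "+" else -1.0
    return sign * float(match.group(2))


def net_accepted_impact(entries: list[LedgerEntry]) -> float:
    """Sum the monthly cost deltas of accepted ADRs only."""
    return sum(monthly_delta(e.cost_impact) for e in entries if e.status == "Accepted")


if __name__ == "__main__":
    ledger = [
        LedgerEntry("ADR-004", "Accepted", "+ $150/month (Kafka Cluster resources)"),
        LedgerEntry("ADR-005", "Accepted", "- $80/month (Replaces over-provisioned ElastiCache nodes)"),
        LedgerEntry("ADR-006", "Rejected", "N/A"),
    ]
    print(net_accepted_impact(ledger))  # 150 - 80 = 70.0
```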
5. Hands-on Architecture Challenge
Scenario Description
You are designing the workflow governance for your engineering team. You must model the complete ADR Lifecycle Workflow from draft to final execution or deprecation.
Your Goal:
- Model the ADR States using flowchart layout shapes:
  - Proposed (Drafting state).
  - Review (ARB review committee).
  - Accepted (Decision approved and active).
  - Rejected (Decision declined).
  - Superseded (Decision replaced by a newer ADR).
- Define the Workflow Transitions:
  - From Proposed to Review.
  - From Review to Accepted (if approved).
  - From Review to Rejected (if declined).
  - From Rejected back to Proposed (if revised and resubmitted).
  - From Accepted to Superseded (when a new ADR overrides it).
- Draw this governance lifecycle using the diagram editor's graph syntax.
6. Practice Challenge Template
Use this template in your sandbox to model the ADR governance lifecycle:
graph TD
Proposed[Proposed / Draft State] -->|Submit for Review| Review[ARB Review Phase]
Review -->|Approved by Committee| Accepted[Accepted / Active Decision]
Review -->|Declined by Committee| Rejected[Rejected / Needs Revision]
Rejected -->|Refactor & Resubmit| Proposed
Accepted -->|Overridden by New ADR| Superseded[Superseded / Deprecated]
style Proposed fill:#9ff,stroke:#333,stroke-width:2px
style Review fill:#9ff,stroke:#333,stroke-width:2px
style Accepted fill:#9f9,stroke:#333,stroke-width:3px
style Rejected fill:#f99,stroke:#333,stroke-width:2px
style Superseded fill:#ccc,stroke:#333,stroke-width:2px
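The same lifecycle can also be expressed as a small state machine, which is useful if tooling needs to reject an invalid status change in an ADR file. This is a sketch under the transitions listed in the challenge; the `AdrState` enum, `ALLOWED_TRANSITIONS` table, and `transition` function are hypothetical names, not part of any ADR tool.

```python
from enum import Enum


class AdrState(Enum):
    PROPOSED = "Proposed"
    REVIEW = "Review"
    ACCEPTED = "Accepted"
    REJECTED = "Rejected"
    SUPERSEDED = "Superseded"


# Each state maps to the states it may move to; mirrors the edges of the graph.
ALLOWED_TRANSITIONS = {
    AdrState.PROPOSED: {AdrState.REVIEW},
    AdrState.REVIEW: {AdrState.ACCEPTED, AdrState.REJECTED},
    AdrState.REJECTED: {AdrState.PROPOSED},
    AdrState.ACCEPTED: {AdrState.SUPERSEDED},
    AdrState.SUPERSEDED: set(),  # terminal state
}


def transition(current: AdrState, target: AdrState) -> AdrState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal ADR transition: {current.value} -> {target.value}")
    return target


if __name__ == "__main__":
    state = AdrState.PROPOSED
    for step in (AdrState.REVIEW, AdrState.REJECTED, AdrState.PROPOSED, AdrState.REVIEW, AdrState.ACCEPTED):
        state = transition(state, step)
    print(state.value)  # Accepted
```

Keeping the transition table as data means a new state can be added by extending the mapping, without touching the validation logic.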
NEXT MODULE BRIDGE: Once your engineering teams are aligned on architectural governance and ADR processes, you must validate these skills in a production-level environment. Proceed to Module 17: The Capstone Architecture Proving Ground to test your system design capabilities against three real-world scenario challenges.