Architecture Review
Purpose: Evaluate and decide on a technical approach before significant implementation begins
How to run this meeting
Use an Architecture Decision Record (ADR) structure: document the context, the options considered, the tradeoffs, and the proposed decision in writing, before the meeting. The review meeting is for discussion and decision, not for presenting information cold. Anyone who arrives without having read the ADR proposal is not ready to participate.
Require at least two alternatives to be presented alongside the proposed approach. "We considered X and Y but chose Z" is a decision. "We're doing Z" is an assumption. Alternatives don't need to be fully fleshed out, but they need to be seriously considered — not strawmen set up to fail. If the presenter can't articulate why the alternatives were rejected, the alternatives weren't actually evaluated.
Be explicit about what you're optimizing for. Tradeoffs only make sense in context. "This option is slower but simpler" is meaningless unless you've stated whether you're optimizing for performance, developer velocity, operational simplicity, or cost. State the optimization target at the top of the document and hold all tradeoff discussions against it.
End the meeting with a clear, recorded decision, or with a documented list of blockers preventing a decision.
Before the meeting
- Author writes and circulates the ADR document at least 48 hours before the meeting
- ADR includes: context, at least 2 alternatives, tradeoffs, and a proposed decision
- Reviewers read the ADR and submit written questions or concerns before the meeting
- Identify the decision-maker (usually the engineering lead or architect) before the meeting starts
- Confirm what "done" looks like: are we deciding today, or are there open blockers?
Meeting Details
- Date:
- Facilitator:
- Attendees:
- Duration: 60 minutes
Problem Description
State the technical problem being solved. Include system context, scale requirements, and any constraints that narrow the solution space. This is the "why we need to make a decision" section.
The Scheduled Reports feature requires a job scheduling system that can enqueue report generation jobs at specific times across multiple timezones, with configurable recurrence (daily/weekly/monthly). Our current Sidekiq setup uses perform_in for one-off delays but has no native cron support. As we scale to support thousands of scheduled reports, we need a reliable, observable scheduling layer that survives restarts and supports per-job timezone handling.
Scale target: 10,000 active schedules within 12 months of launch
Constraints: Must run on our existing AWS infrastructure; no new managed services without security review
Proposed Architecture
Describe the recommended approach. Include diagrams or system sketches where helpful (link to Miro, Excalidraw, etc.). Explain why this is the proposed option, not just what it is.
Option A (Proposed): Sidekiq-Cron with a scheduled_reports database table
Add the sidekiq-cron gem. Store each user-created schedule as a row in a scheduled_reports table with fields for cadence, timezone, next_run_at, and last_run_at. A recurring Sidekiq-Cron job polls the table and enqueues a report generation job for every row whose next_run_at has passed. A separate nightly maintenance job recomputes next_run_at values so that schedules keep their local wall-clock time across DST transitions.
Why proposed: Sidekiq-Cron is already used by 3 adjacent teams, we have operational experience with it, and it avoids introducing a new infrastructure component (e.g., a separate scheduler service or AWS EventBridge).
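The polling flow described for Option A can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: the `ScheduledReport` struct stands in for a database row, and `poll_due_schedules`, `CADENCE_SECONDS`, and the `enqueue` callback are hypothetical names. It advances next_run_at by a fixed number of seconds, which is exactly the simplification that makes the DST maintenance job necessary.

```ruby
# Hypothetical in-memory stand-in for a row in the scheduled_reports table.
ScheduledReport = Struct.new(:id, :cadence, :timezone, :next_run_at, :last_run_at)

# Seconds to add per cadence. Assumption for illustration only: "monthly" is
# omitted because month lengths vary, and fixed-second steps ignore DST.
CADENCE_SECONDS = { "daily" => 86_400, "weekly" => 7 * 86_400 }.freeze

# Enqueues every schedule that is due at `now`, advances its next_run_at,
# and returns the ids that were enqueued on this tick.
def poll_due_schedules(schedules, now:, enqueue:)
  due = schedules.select { |s| s.next_run_at <= now }
  due.each do |s|
    enqueue.call(s.id) # in production this would hand off to a Sidekiq worker
    s.last_run_at = now
    s.next_run_at += CADENCE_SECONDS.fetch(s.cadence)
  end
  due.map(&:id)
end

now = Time.utc(2024, 12, 5, 14, 0, 0)
schedules = [
  ScheduledReport.new(1, "daily",  "America/New_York", Time.utc(2024, 12, 5, 14, 0, 0), nil),
  ScheduledReport.new(2, "weekly", "Europe/Berlin",    Time.utc(2024, 12, 6, 8, 0, 0),  nil)
]

enqueued = []
poll_due_schedules(schedules, now: now, enqueue: ->(id) { enqueued << id })
```

In this run only schedule 1 is due, so it is enqueued and moved forward by one day while schedule 2 is left untouched. Passing `now` and `enqueue` in as arguments keeps the logic testable without a running Sidekiq or database.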
Alternatives Considered
Describe each alternative seriously. Include enough detail that a reader can understand why it was or wasn't chosen.
Option B: AWS EventBridge Scheduler
Fully managed, supports cron expressions natively, and handles timezone-aware scheduling. Downside: requires new IAM permissions and security review (4–6 week timeline), introduces an external dependency that would be hard to test locally, and EventBridge pricing becomes meaningful at 10K+ schedules with high invocation frequency.
Option C: Custom scheduler microservice
Build a standalone service that owns scheduling logic. Maximum flexibility and scalability. Downside: significant build cost (estimated 3–4 sprints), introduces a new service to operate and monitor, and is disproportionate to our current scale needs. Revisit if we reach 100K+ schedules.
Tradeoffs
Make the tradeoffs explicit against your stated optimization target. What are you gaining and giving up with the proposed approach?
Optimizing for: Developer velocity and operational simplicity at current scale (< 10K schedules)
| | Option A (Proposed) | Option B (EventBridge) | Option C (Microservice) |
|---|---|---|---|
| Implementation time | ~1 sprint | ~2 sprints (incl. security review) | ~4 sprints |
| Operational complexity | Low (familiar stack) | Medium (new AWS service) | High (new service) |
| Scale ceiling | ~50K schedules | Effectively unlimited | Effectively unlimited |
| Local dev/test | Easy | Hard (requires mocks) | Medium |
| DST handling | Manual (maintenance job) | Native | Custom |
Risks
List the risks of the proposed approach and how each will be mitigated.
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| sidekiq-cron doesn't scale to 50K schedules | Low | High | Load test at 10K before launch; migration path to Option B documented |
| DST maintenance job has a bug causing missed or doubled sends | Medium | Medium | Unit test all timezone edge cases; add monitoring alert for jobs not running within ±5 min of scheduled time |
| next_run_at computation errors after daylight saving transition | Medium | Medium | Test with IANA timezone strings across all US/EU zones; run a DST dry-run in staging |
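The two DST risks above come down to one fact: a fixed UTC instant does not stay at the same local wall-clock time across a transition. The sketch below illustrates this for a report scheduled at 09:00 in America/New_York around the 2025-03-09 spring-forward. The UTC offsets are hard-coded as an assumption so that the example needs only the standard library; a real implementation would resolve them from the IANA zone name stored on the schedule using a timezone database library.

```ruby
# Assumption for illustration: offsets are hard-coded rather than looked up.
# America/New_York moved from UTC-05:00 to UTC-04:00 on 2025-03-09.
before_dst = Time.new(2025, 3, 8, 9, 0, 0, "-05:00")  # 09:00 local, day before
after_dst  = Time.new(2025, 3, 10, 9, 0, 0, "-04:00") # 09:00 local, day after

# The same local time maps to two different UTC hours.
utc_hour_before = before_dst.getutc.hour
utc_hour_after  = after_dst.getutc.hour

# If next_run_at is left at the pre-transition UTC hour, the report goes out
# an hour late in local time after the clocks change.
stale_next_run_at = Time.utc(2025, 3, 10, utc_hour_before, 0, 0)
stale_local_hour  = stale_next_run_at.getlocal("-04:00").hour

puts "UTC hour before DST: #{utc_hour_before}"
puts "UTC hour after DST:  #{utc_hour_after}"
puts "Stale schedule fires at local hour: #{stale_local_hour}"
```

A staging dry-run could assert the same property for each supported zone: after recomputation, next_run_at converted back to the schedule's zone should still land on the originally requested local hour.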
Decision
Record the final decision, who made it, and the date. If no decision was reached, document the blockers.
Decision: Proceed with Option A (Sidekiq-Cron + scheduled_reports table)
Decision maker: @elena (Engineering Lead)
Date: 2024-12-05
Rationale: Acceptable risk at current scale, fastest path to shipping, team has existing operational experience. Revisit Option B if we exceed 30K active schedules.
Action Items
| Owner | Action | Due Date | Status |
|---|---|---|---|
| @backend | Write load test plan for 10K schedule simulation | 2024-12-12 | Open |
| @backend | Document migration path to EventBridge as a future option | 2024-12-12 | Open |
| @elena | Update ADR document with final decision and post to architecture wiki | 2024-12-06 | Open |
| @infra | Add monitoring alert for jobs running >5 min outside scheduled time | 2024-12-19 | Open |
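The monitoring action item could start from a check like the one below. This is a hedged sketch only: `JobRun`, `drifted_runs`, and `MAX_DRIFT_SECONDS` are hypothetical names, and the way execution records are collected (logs, a database table, metrics) is left open.

```ruby
# Threshold from the action item: flag runs that start more than 5 minutes
# away from their scheduled time, in either direction.
MAX_DRIFT_SECONDS = 5 * 60

# Hypothetical record of one execution; field names are illustrative.
JobRun = Struct.new(:schedule_id, :scheduled_at, :started_at)

# Returns the runs whose start time drifted beyond the allowed window.
def drifted_runs(runs, max_drift: MAX_DRIFT_SECONDS)
  runs.select { |r| (r.started_at - r.scheduled_at).abs > max_drift }
end

scheduled = Time.utc(2024, 12, 19, 9, 0, 0)
runs = [
  JobRun.new(1, scheduled, scheduled + 30),  # 30 s late: within tolerance
  JobRun.new(2, scheduled, scheduled + 420), # 7 min late: flagged
  JobRun.new(3, scheduled, scheduled - 360)  # 6 min early: flagged
]

late_ids = drifted_runs(runs).map(&:schedule_id)
```

Note that this only catches runs that started at the wrong time. Jobs that never ran at all would need a separate check, for example comparing next_run_at against the current time for rows with no recent last_run_at.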
Follow-up
The author updates the ADR document with the final decision and posts it to the team's architecture decision log. Link the ADR from the relevant feature tracking issue. If the decision reverses or is revisited, update the ADR with the new context — don't delete the original reasoning. Share a summary with the broader engineering team in the weekly eng sync.