Product Discovery & Design

Architecture Review

Evaluate technical design.

Purpose: Evaluate and decide on a technical approach before significant implementation begins

How to run this meeting

Use Architecture Decision Record (ADR) structure: document the context, the options considered, the tradeoffs, and the final decision — in writing, before the meeting. The review meeting is for discussion and decision, not for presenting information cold. Anyone coming to the meeting without having read the ADR proposal is not ready to participate.

Require at least two alternatives to be presented alongside the proposed approach. "We considered X and Y but chose Z" is a decision. "We're doing Z" is an assumption. Alternatives don't need to be fully fleshed out, but they need to be seriously considered — not strawmen set up to fail. If the presenter can't articulate why the alternatives were rejected, the alternatives weren't actually evaluated.

Be explicit about what you're optimizing for. Tradeoffs only make sense in context. "This option is slower but simpler" is meaningless unless you've stated whether you're optimizing for performance, developer velocity, operational simplicity, or cost. State the optimization target at the top of the document and hold all tradeoff discussions against it.

End the meeting with a clear, recorded decision, or with a documented list of blockers preventing a decision.

Before the meeting

  • Author writes and circulates the ADR document at least 48 hours before the meeting
  • ADR includes: context, at least 2 alternatives, tradeoffs, and a proposed decision
  • Reviewers read the ADR and submit written questions or concerns before the meeting
  • Identify the decision-maker (usually the engineering lead or architect) before the meeting starts
  • Confirm what "done" looks like: are we deciding today, or are there open blockers?

Meeting Details

  • Date:
  • Facilitator:
  • Attendees:
  • Duration: 60 minutes

Problem Description

State the technical problem being solved. Include system context, scale requirements, and any constraints that narrow the solution space. This is the "why we need to make a decision" section.

The Scheduled Reports feature requires a job scheduling system that can enqueue report generation jobs at specific times across multiple timezones, with configurable recurrence (daily/weekly/monthly). Our current Sidekiq setup uses perform_in for one-off delays but has no native cron support. As we scale to support thousands of scheduled reports, we need a reliable, observable scheduling layer that survives restarts and supports per-job timezone handling.

Scale target: 10,000 active schedules within 12 months of launch
Constraints: Must run on our existing AWS infrastructure; no new managed services without security review


Proposed Architecture

Describe the recommended approach. Include diagrams or system sketches where helpful (link to Miro, Excalidraw, etc.). Explain why this is the proposed option, not just what it is.

Option A (Proposed): Sidekiq-Cron with a scheduled_reports database table

Add the sidekiq-cron gem. Store each user-created schedule as a row in a scheduled_reports table with fields for cadence, timezone, next_run_at, and last_run_at. A nightly maintenance job recomputes next_run_at values to handle DST transitions. A recurring Sidekiq-Cron job polls the table for rows whose next_run_at has passed and enqueues the corresponding report generation jobs.
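The recompute-and-poll logic described above can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not the team's implementation: EASTERN_2025 is a hypothetical stand-in for a real timezone library and hard-codes the 2025 US Eastern DST dates, ScheduledReport is an in-memory struct rather than a database model, and the helper names (local_to_utc, next_run_at, due_schedules) are invented for this example. The point it shows is that next_run_at should be derived from the local wall-clock time on the next run date, not by adding a fixed number of seconds to the previous run.

```ruby
require "date"

# ASSUMPTION: hypothetical stand-in for a real timezone library. Maps a local
# calendar date to a UTC offset in seconds. Simplified US Eastern rule for
# 2025 only (DST from Mar 9 until Nov 2), valid for run times after 03:00 local.
EASTERN_2025 = lambda do |date|
  dst = date >= Date.new(2025, 3, 9) && date < Date.new(2025, 11, 2)
  (dst ? -4 : -5) * 3600
end

# ASSUMPTION: illustrative in-memory row; the real table has more fields.
ScheduledReport = Struct.new(:id, :next_run_at, keyword_init: true)

# Convert a local wall-clock time on a given date to a UTC instant.
def local_to_utc(date, hour, minute, offset_for)
  Time.new(date.year, date.month, date.day, hour, minute, 0, offset_for.call(date)).utc
end

# Advance a schedule by one cadence step from its last local run date,
# then resolve the wall-clock time using the offset in effect on that date.
def next_run_at(last_local_date, cadence, hour, minute, offset_for)
  next_date =
    case cadence
    when :daily   then last_local_date + 1
    when :weekly  then last_local_date + 7
    when :monthly then last_local_date >> 1 # Date#>> clamps to the month end
    else raise ArgumentError, "unknown cadence: #{cadence.inspect}"
    end
  local_to_utc(next_date, hour, minute, offset_for)
end

# Rows a polling job would enqueue at time `now`.
def due_schedules(schedules, now)
  schedules.select { |s| s.next_run_at <= now }
end

# A daily 09:00 report whose last run was the day before DST starts.
saturday = Date.new(2025, 3, 8)
previous = local_to_utc(saturday, 9, 0, EASTERN_2025)
upcoming = next_run_at(saturday, :daily, 9, 0, EASTERN_2025)
puts previous # 2025-03-08 14:00:00 UTC
puts upcoming # 2025-03-09 13:00:00 UTC
```

The two runs are 23 hours apart in UTC even though both happen at 09:00 local time. Adding 24 hours to the previous UTC value would have sent the report at 10:00 local on the day of the transition.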

[Figure: architecture diagram]

Why proposed: Sidekiq-Cron is already used by 3 adjacent teams, we have operational experience with it, and it avoids introducing a new infrastructure component (e.g., a separate scheduler service or AWS EventBridge).


Alternatives Considered

Describe each alternative seriously. Include enough detail that a reader can understand why it was or wasn't chosen.

Option B: AWS EventBridge Scheduler

Fully managed, supports cron expressions natively, handles timezone scheduling. Downside: requires new IAM permissions and security review (4–6 week timeline), introduces an external dependency that would be hard to test locally, and EventBridge pricing becomes meaningful at 10K+ schedules with high invocation frequency.

Option C: Custom scheduler microservice

Build a standalone service that owns scheduling logic. Maximum flexibility and scalability. Downside: significant build cost (estimated 3–4 sprints), introduces a new service to operate and monitor, and is disproportionate to our current scale needs. Revisit if we reach 100K+ schedules.


Tradeoffs

Make the tradeoffs explicit against your stated optimization target. What are you gaining and giving up with the proposed approach?

Optimizing for: Developer velocity and operational simplicity at current scale (< 10K schedules)

| Criterion | Option A (Proposed) | Option B (EventBridge) | Option C (Microservice) |
| --- | --- | --- | --- |
| Implementation time | ~1 sprint | ~2 sprints (incl. security review) | ~4 sprints |
| Operational complexity | Low (familiar stack) | Medium (new AWS service) | High (new service) |
| Scale ceiling | ~50K schedules | Effectively unlimited | Effectively unlimited |
| Local dev/test | Easy | Hard (requires mocks) | Medium |
| DST handling | Manual (maintenance job) | Native | Custom |

Risks

List the risks of the proposed approach and how each will be mitigated.

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| sidekiq-cron doesn't scale to 50K schedules | Low | High | Load test at 10K before launch; migration path to Option B documented |
| DST maintenance job has a bug causing missed or doubled sends | Medium | Medium | Unit test all timezone edge cases; add monitoring alert for jobs not running within ±5 min of scheduled time |
| next_run_at computation errors after daylight saving transition | Medium | Medium | Test with IANA timezone strings across all US/EU zones; run a DST dry-run in staging |
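The ±5 minute monitoring check named in the mitigations could look roughly like the sketch below. The constant and method names are hypothetical, and a real alert would read scheduled and actual start times from job metadata or metrics rather than from literals.

```ruby
# Tolerance from the risk table: 5 minutes either side of the scheduled time.
TOLERANCE_SECONDS = 5 * 60

# True when a job started more than the tolerance before or after its
# scheduled time. Both arguments are Time instants.
def outside_tolerance?(scheduled_at, started_at, tolerance = TOLERANCE_SECONDS)
  (started_at - scheduled_at).abs > tolerance
end

scheduled = Time.utc(2025, 3, 9, 13, 0, 0)
puts outside_tolerance?(scheduled, scheduled + 120)  # false: 2 minutes late
puts outside_tolerance?(scheduled, scheduled + 3600) # true: one hour late, the typical DST symptom
```

Checking the absolute difference catches both late and early runs, which matters because a DST bug can shift a job an hour in either direction.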

Decision

Record the final decision, who made it, and the date. If no decision was reached, document the blockers.

Decision: Proceed with Option A (Sidekiq-Cron + scheduled_reports table)
Decision maker: @elena (Engineering Lead)
Date: 2024-12-05
Rationale: Acceptable risk at current scale, fastest path to shipping, team has existing operational experience. Revisit Option B if we exceed 30K active schedules.


Action Items

| Owner | Action | Due Date | Status |
| --- | --- | --- | --- |
| @backend | Write load test plan for 10K schedule simulation | 2024-12-12 | Open |
| @backend | Document migration path to EventBridge as a future option | 2024-12-12 | Open |
| @elena | Update ADR document with final decision and post to architecture wiki | 2024-12-06 | Open |
| @infra | Add monitoring alert for jobs running >5 min outside scheduled time | 2024-12-19 | Open |

Follow-up

The author updates the ADR document with the final decision and posts it to the team's architecture decision log. Link the ADR from the relevant feature tracking issue. If the decision reverses or is revisited, update the ADR with the new context — don't delete the original reasoning. Share a summary with the broader engineering team in the weekly eng sync.
