Feature Flag Platform Comparison 2026: An Honest Self-Audit
Why We're Publishing Our Own Product Audit
Most SaaS companies publish comparison pages that make them look perfect. We're doing something different: a brutally honest audit of our own platform, scored by the same criteria we'd use to evaluate any feature flag tool.
We built Rollgate because we were tired of paying five figures for a configuration service. But being cheaper isn't enough — you need to be good. So we asked ourselves: if a senior product owner with 10 years of experience building feature flag platforms evaluated Rollgate today, what would they say?
Here's the result: every feature area scored from 1 to 5 and compared against LaunchDarkly, Flagsmith, ConfigCat, and GrowthBook. No cherry-picking, no spin.
Core Feature Flags — 4.5/5
What we built: Four flag types (boolean, string, number, JSON), four categories (release, experiment, kill switch, ops), full lifecycle management with scheduling, 1-click rollback, history, tags, and multi-environment support.
What's strong: Scheduled changes are available on every paid plan. LaunchDarkly restricts scheduling to Enterprise tier. Our rollback system stores the last 20 states per flag — one click restores the exact previous configuration including targeting rules, percentage, and variations.
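A bounded rollback history like this can be sketched as a capped snapshot stack. This is an illustrative sketch only; the class and method names are hypothetical, not Rollgate's actual internals.

```python
from collections import deque
from copy import deepcopy

class FlagHistory:
    """Keep the last N flag states; rollback() restores the previous one.

    Illustrative sketch only: names and state shapes are hypothetical.
    """

    def __init__(self, max_states: int = 20):
        # deque(maxlen=...) silently drops the oldest state at capacity.
        self._states = deque(maxlen=max_states)

    def record(self, state: dict) -> None:
        # Deep-copy so later edits to the live flag can't rewrite history.
        self._states.append(deepcopy(state))

    def rollback(self) -> dict:
        # Discard the current state and return the one before it,
        # including targeting rules, percentage, and variations.
        if len(self._states) < 2:
            raise ValueError("no previous state to restore")
        self._states.pop()
        return deepcopy(self._states[-1])
```

The deep copies matter: storing live references would let a later edit silently mutate a "historical" snapshot.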
What's missing: Flag dependencies (enable flag B only if flag A is enabled). Stale flag detection (warn when a flag has been at 100% for weeks). Approval workflows for critical flags. These are enterprise features that matter at scale but aren't critical for our current target market.
vs competitors: On par with Flagsmith and ConfigCat for core flag management. Below LaunchDarkly for enterprise workflows (approval gates, change requests). Ahead of GrowthBook, which treats flags as secondary to experimentation. For a detailed comparison, see our LaunchDarkly alternative guide.
Targeting & Segmentation — 4/5
What we built: 18 targeting operators including semver comparison, regex, numeric ranges, and list membership. Two segment types: rule-based (define conditions) and list-based (explicit user lists). Segments are reusable across flags. Percentage rollouts use consistent hashing so the same user always sees the same variant.
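The consistent-hashing rollout described above can be sketched in a few lines: hash the flag key plus user ID into a stable bucket from 0 to 99, then compare against the rollout percentage. The SHA-256 choice here is an assumption for illustration, not necessarily the hash Rollgate uses.

```python
import hashlib

def rollout_bucket(flag_key: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) % 100

def in_rollout(flag_key: str, user_id: str, percentage: int) -> bool:
    # The same user always lands in the same bucket, so raising the
    # percentage only ever adds users; it never flips existing ones out.
    return rollout_bucket(flag_key, user_id) < percentage
```

Because the flag key is part of the hash input, buckets are independent per flag: a user inside one flag's 10% rollout may be outside another's.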
The 18 operators:
equals, not_equals, contains, not_contains, starts_with, ends_with, in, not_in, gt, gte, lt, lte, regex, is_set, is_not_set, semver_gt, semver_lt, semver_eq
What's strong: Semver targeting is rare — most competitors don't offer it. This matters for mobile teams doing version-based rollouts. Rule-based segments with reuse across flags reduce duplication.
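Why semver targeting needs a dedicated operator: lexicographically, "2.10.0" sorts before "2.9.0". A minimal numeric-tuple sketch of a semver_gt check, ignoring pre-release and build metadata from the full SemVer 2.0.0 spec:

```python
def parse_semver(version: str) -> tuple:
    """'MAJOR.MINOR.PATCH' -> (major, minor, patch) as ints.

    Sketch only: pre-release tags and build metadata are ignored.
    """
    return tuple(int(part) for part in version.split("."))

def semver_gt(attribute: str, target: str) -> bool:
    # Tuples compare element-wise, so 2.10.0 > 2.9.0 as intended.
    return parse_semver(attribute) > parse_semver(target)
```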
What's missing: Segment match count preview ("this rule matches ~2,340 users" before saving). CSV import for list-based segments. Nested rule groups (currently AND/OR, but not nested groups of groups).
vs competitors: On par with ConfigCat and Flagsmith. LaunchDarkly has more sophisticated targeting with prerequisites and custom attributes indexing. GrowthBook has strong experimentation targeting but simpler flag targeting.
A/B Testing & Experimentation — 3/5
This is our most honest score. A/B testing works, but it's not yet where it needs to be for data-driven product teams.
What we built: Full experiment lifecycle (draft → running → paused → ended). Per-variation statistics with exposures, conversions, and conversion rates. Z-test statistical significance with confidence intervals. Lift calculation. Winner declaration with optional auto-rollout to the winning variation. Event tracking integrated into all 13 SDKs.
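For readers unfamiliar with the method, a two-proportion z-test with lift can be sketched as below. This is the textbook frequentist formulation, not Rollgate's actual implementation.

```python
from math import erf, sqrt

def z_test(conversions_a: int, n_a: int, conversions_b: int, n_b: int):
    """Two-proportion z-test: returns (z, two-sided p-value, lift).

    Textbook sketch; assumes the control converted at least once
    (otherwise lift is undefined).
    """
    p_a, p_b = conversions_a / n_a, conversions_b / n_b
    pooled = (conversions_a + conversions_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard-normal CDF via erf gives the two-sided p-value.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    lift = (p_b - p_a) / p_a
    return z, p_value, lift
```

With 1,000 users per arm converting at 10% vs 15%, this yields a 50% lift at p well below 0.05, the kind of result the winner-declaration flow acts on.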
What's strong: Auto-rollout of the winning variation is a genuine time-saver — declare a winner and the flag automatically updates to serve the winning variant to 100% of users. Event tracking is built into every SDK, not bolted on.
What's missing: Mutual exclusion between experiments (critical when running overlapping tests). Sample size calculator (users don't know how much traffic they need for statistical significance). Guardrail metrics (automatically stop an experiment if a critical metric degrades). Bayesian statistics as an alternative to frequentist testing. Experiment report export.
These gaps matter. Without mutual exclusion, overlapping experiments can produce invalid results. Without sample size estimation, teams either run experiments too short (inconclusive) or too long (wasted time).
vs competitors: Below GrowthBook, which is experimentation-first and offers Bayesian stats, sequential testing, and warehouse integration. Above Flagsmith, which has no native A/B testing. Below LaunchDarkly's Experimentation add-on. Above ConfigCat, which has zero A/B capability.
What we're building next: Mutual exclusion layers and a sample size calculator are our top experimentation priorities.
Real-Time & Performance — 5/5
What we built: Server-Sent Events (SSE) streaming across most of the 13 SDKs (Flutter and React Native currently use polling). Redis caching with 60-second TTL. ETag and Last-Modified conditional requests. Cache invalidation on flag changes. Bulk flag fetching. Per-plan SSE connection limits. Local evaluation mode — server SDKs download the full ruleset via SSE and evaluate flags locally without network round-trips.
What's strong: Local evaluation is the same architecture LaunchDarkly uses with their Relay Proxy, but we built it directly into the SDK. Flag evaluation happens in-memory with sub-microsecond latency. The API responds in 2-3ms server-side. Circuit breakers in every SDK ensure your application never fails because the flag service is down.
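Local evaluation boils down to running the downloaded ruleset in-process. A deliberately tiny sketch, with three sample operators, first-match-wins semantics, and a hypothetical rule shape that is not Rollgate's actual wire format:

```python
# A tiny subset of targeting operators, for illustration only.
OPERATORS = {
    "equals": lambda attr, target: attr == target,
    "in": lambda attr, target: attr in target,
    "gt": lambda attr, target: attr > target,
}

def evaluate(flag: dict, user: dict):
    """Evaluate a flag against a user entirely in memory.

    First matching rule wins; otherwise the default variation is
    served. No network round-trip is involved.
    """
    for rule in flag.get("rules", []):
        value = user.get(rule["attribute"])
        if value is not None and OPERATORS[rule["operator"]](value, rule["target"]):
            return rule["variation"]
    return flag["default"]
```

Because the ruleset is already in memory, each call is a dictionary lookup plus a few comparisons, which is where the sub-microsecond figure comes from.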
Every SDK ships with:
- Circuit breaker (CLOSED → OPEN → HALF_OPEN state machine)
- Exponential backoff with jitter
- Request deduplication
- Stale-while-revalidate caching
- ETag support for bandwidth efficiency
- Distributed tracing (W3C Trace Context)
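The first two bullets can be sketched together: a minimal breaker state machine plus a full-jitter backoff helper. Thresholds, cooldowns, and names here are illustrative, not the SDKs' actual defaults.

```python
import random
import time

class CircuitBreaker:
    """Minimal CLOSED -> OPEN -> HALF_OPEN sketch (illustrative defaults)."""

    def __init__(self, failure_threshold: int = 5, cooldown: float = 30.0):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.failures = 0
        self.state = "CLOSED"
        self.opened_at = 0.0

    def allow_request(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        if self.state == "OPEN":
            if now - self.opened_at >= self.cooldown:
                self.state = "HALF_OPEN"  # let one probe request through
                return True
            return False  # fail fast; the SDK serves cached values instead
        return True

    def record_success(self) -> None:
        self.failures = 0
        self.state = "CLOSED"

    def record_failure(self, now=None) -> None:
        now = time.monotonic() if now is None else now
        self.failures += 1
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = now

def backoff_with_jitter(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff delay: uniform in [0, min(cap, base * 2^n)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

The jitter is what prevents a thundering herd: clients that failed at the same moment retry at randomized times instead of in lockstep.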
What's missing: SSE for Flutter and React Native (both currently polling-only). Edge evaluation via CDN workers.
vs competitors: On par with LaunchDarkly (SSE + local eval). Ahead of ConfigCat (polling only, no local eval). Ahead of GrowthBook (local eval but no SSE). On par with Flagsmith (SSE support). The resilience features (circuit breaker, retry, tracing in every SDK) are a differentiator — most competitors add these as optional plugins, not defaults. For implementation details, see our gradual rollouts guide.
SDK Experience — 4.5/5
What we built: 13 SDKs, all published and production-ready:
| SDK | Registry | Version | SSE | Tests |
|---|---|---|---|---|
| Core (shared) | npm | 1.2.3 | — | 4 |
| Node.js | npm | 1.2.3 | Yes | 5+ |
| Browser | npm | 1.2.3 | Yes | 1 |
| React | npm | 1.2.3 | Yes | 4 |
| Vue | npm | 1.2.3 | Yes | 2 |
| Angular | npm | 1.2.3 | Yes | 1 |
| Svelte | npm | 1.2.3 | Yes | 2 |
| React Native | npm | 1.2.3 | No | 0* |
| Go | Go modules | 1.1.0 | Yes | 8 |
| Python | PyPI | 1.2.3 | Yes | 8 |
| Java | Maven | 1.2.3 | Yes | 7 |
| .NET | NuGet | 1.2.3 | Yes | 0* |
| Flutter | pub.dev | 1.2.3 | No | 0* |
*These SDKs use cross-SDK contract tests (102 tests) instead of individual unit tests.
Architecture: All TypeScript SDKs share a common sdk-core package that implements caching, circuit breaker, retry, metrics, events, and tracing. Framework SDKs (React, Vue, Angular, Svelte) wrap sdk-browser, which wraps sdk-core. Native SDKs (Go, Java, Python, .NET, Flutter) implement the same patterns independently.
What's strong: 13 SDKs is more than GrowthBook (8) and on par with ConfigCat (12). Every SDK — not just the popular ones — ships with circuit breaker, retry, caching, event tracking, and distributed tracing. The consistent API surface means switching languages doesn't mean learning a new SDK.
What's missing: OpenFeature provider (the emerging vendor-neutral standard for feature flags). Offline-first mode for mobile SDKs.
vs competitors: Below LaunchDarkly (25+ SDKs) but with better feature parity across SDKs. On par with Flagsmith (18 SDKs). Ahead of GrowthBook (8). On par with ConfigCat (12). See our SDK tutorials for React, Next.js, Go, and Python.
Dashboard UX — 4/5
What we built: Complete flag management with search, filter, sort, and bulk actions. Six-step onboarding checklist that auto-tracks progress. Command palette (Cmd+K) for quick navigation. Drag-and-drop rules editor. Audit log with entity and action filtering. Usage monitoring with progress bars against plan limits. Dark mode. Fully responsive for mobile. Keyboard shortcuts.
What's strong: The onboarding checklist is better than what LaunchDarkly, Flagsmith, or ConfigCat offer for new users. It tracks actual progress (did you create a project? create a flag? make an SDK call?) rather than just showing a tutorial. The command palette is a power-user feature that none of our competitors have.
What's missing: Rule simulation ("if I save this rule, would user X see the flag?"). Environment diff view (compare flag state between staging and production side by side). Collaborative editing awareness (currently, two people editing the same flag can't see each other's changes). In-app help tooltips on complex features.
vs competitors: On par with or ahead of ConfigCat and Flagsmith for UX quality. Below LaunchDarkly for enterprise features (approval workflows, change requests, scheduled approvals).
Billing & Packaging — 4.5/5
What we built: Paddle as Merchant of Record (automatic VAT/tax handling). Four plans (Free, Starter €39-45/mo, Pro €99-119/mo, Growth €299-349/mo) plus custom Enterprise. Monthly and annual billing with 15-18% annual discount. Full usage tracking with monthly history. Plan limit enforcement with fail-open design. Complete checkout, cancel, resume, and plan change flows.
| | Free | Starter | Pro | Growth |
|---|---|---|---|---|
| SDK requests/mo | 500K | 1M | 3M | 12M |
| SSE connections | 1 | 25 | 100 | 500 |
| Projects | 3 | 5 | 10 | Unlimited |
| Team members | 3 | 5 | 15 | 50 |
| Flags | Unlimited | Unlimited | Unlimited | Unlimited |
| Scheduled changes | — | Yes | Yes | Yes |
| Rollback | — | Yes | Yes | Yes |
| Audit log | 3 days | 14 days | 90 days | 1 year |
What's strong: The free tier (500K requests/month, unlimited flags) is the most generous in the market. ConfigCat limits you to 10 flags on free. The fail-open design means SDK calls never hard-fail when you exceed limits — they return cached values and the API returns proper rate limit headers. Pricing is transparent and public, unlike LaunchDarkly's "contact sales" model.
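Fail-open behavior can be sketched as a read path that never raises to the caller. In this sketch, `fetch` is a stand-in for the real SDK transport, an assumption for illustration.

```python
def get_flag(key, fetch, cache, default=False):
    """Fail-open flag read.

    Any fetch failure (rate limit, network outage) falls back to the
    last cached value, then to a safe default. Sketch only; `fetch`
    stands in for the real SDK transport.
    """
    try:
        value = fetch(key)
    except Exception:
        return cache.get(key, default)  # degrade, never hard-fail
    cache[key] = value  # refresh cache on every successful read
    return value
```

The same pattern covers both exceeding plan limits and genuine outages: the application keeps running on the last known flag state.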
What's missing: Free trial for paid plans (14-day Pro trial would help conversion). Proactive email alerts when approaching usage limits.
vs competitors: Pricing is our strongest competitive advantage against LaunchDarkly. A team at 100K MAU pays ~€99/month on Rollgate vs ~$4,000/month on LaunchDarkly. That's a 40x difference. See our detailed pricing comparison and ConfigCat comparison.
Operational — 3.5/5
What we built: Seven webhook event types with HMAC signing, delivery tracking, and automatic retry. Partial OpenAPI specification (~40 of 114 endpoints). Redis-based rate limiting with sliding window. Complete audit log with IP tracking. Health, readiness, and liveness endpoints. Prometheus metrics endpoint.
What's strong: The webhook system is production-grade — HMAC signing prevents spoofing, delivery history with response codes aids debugging, and retry with backoff handles transient failures.
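Receiver-side verification of an HMAC-signed webhook looks roughly like this. The SHA-256 algorithm and hex encoding are assumptions for the sketch; check the webhook documentation for the actual header name and format.

```python
import hashlib
import hmac

def sign(secret: bytes, payload: bytes) -> str:
    """Hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()

def verify(secret: bytes, payload: bytes, signature: str) -> bool:
    # compare_digest runs in constant time, closing the timing
    # side-channel on top of the spoofing protection HMAC provides.
    return hmac.compare_digest(sign(secret, payload), signature)
```

Note that verification must run over the raw request bytes; re-serializing parsed JSON can reorder keys and invalidate the signature.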
What's missing: Complete OpenAPI spec (only 35% of endpoints documented). CLI tool for flag management from terminal. Terraform/Pulumi provider for infrastructure-as-code workflows. Native Slack/Teams notifications. Data export (flag configurations, audit log).
vs competitors: Below LaunchDarkly and Flagsmith (both have CLI tools and Terraform providers). On par with ConfigCat. The incomplete OpenAPI spec is our biggest operational gap — developers evaluating the API want complete, interactive documentation.
The Scorecard
| Area | Score | vs LD | vs Flagsmith | vs ConfigCat | vs GrowthBook |
|---|---|---|---|---|---|
| Core Flags | 4.5/5 | Below | Even | Even | Above |
| Targeting | 4/5 | Below | Even | Even | Even |
| A/B Testing | 3/5 | Below | Above | Above | Below |
| Real-Time | 5/5 | Even | Even | Above | Above |
| SDKs | 4.5/5 | Below* | Even | Even | Above |
| Dashboard | 4/5 | Below | Even | Above | Above |
| Billing | 4.5/5 | Above | Even | Even | Even |
| Operational | 3.5/5 | Below | Below | Even | Even |
*LaunchDarkly has more SDKs (25+) but Rollgate has better feature parity across all 13.
Overall: 4.1/5
What We're Building Next
Based on this audit, here are our priorities ordered by impact:
High Priority
- Free trial for Pro plan — 14 days, no credit card. Let users experience scheduled changes and rollback before committing.
- Mutual exclusion for A/B tests — Without this, overlapping experiments produce invalid data.
- Sample size calculator — Show users how much traffic they need before starting an experiment.
- Complete OpenAPI spec — Cover all 114 endpoints with interactive documentation.
Medium Priority
- Rule simulation — "Test this targeting rule against user X" before saving.
- CLI tool — Manage flags from terminal, integrate with CI/CD pipelines.
- Slack notifications — Alert channels when flags change.
- Environment diff view — Compare flag state across environments side by side.
Quick Wins (In Progress)
- Stale flag detection banner
- Usage warning indicators
- Contextual upgrade prompts
- Environment diff badges
FAQ
Is Rollgate a good alternative to LaunchDarkly?
For teams under 50 developers, yes. Rollgate covers the same core use cases (feature flags, targeting, rollouts, kill switches) at a fraction of the cost. The main gaps are enterprise features like approval workflows and SSO SAML. See our detailed LaunchDarkly comparison.
How does Rollgate compare to open-source tools like Flagsmith and GrowthBook?
Rollgate scores higher on real-time infrastructure (SSE + local evaluation) and SDK resilience (circuit breakers in every SDK). Flagsmith has more SDKs (18 vs 13) and a self-hosted option. GrowthBook has stronger A/B testing with Bayesian statistics. Rollgate sits between them — stronger than Flagsmith on experimentation, stronger than GrowthBook on feature flag management.
What are Rollgate's biggest weaknesses?
A/B testing (3/5) and operational tooling (3.5/5). Specifically: no mutual exclusion between experiments, no sample size calculator, incomplete OpenAPI documentation, and no CLI tool. We're actively building all of these.
Is Rollgate production-ready?
Yes. 13 published SDKs, all with circuit breakers and graceful degradation. The API runs on Hetzner in Germany (EU data residency) with PostgreSQL and Redis. The fail-open design means your application continues working even if the Rollgate API is temporarily unreachable.
How much does Rollgate cost compared to competitors?
A team with 100K monthly active users pays ~€99/month on Rollgate vs ~$4,000/month on LaunchDarkly. That's a 40x difference. See the full pricing breakdown.
The Honest Summary
Rollgate is a solid 4/5 platform that covers the core feature flag use case better than most alternatives, at a fraction of the cost. Our real-time infrastructure and SDK resilience are best-in-class. Our A/B testing and operational tooling need work.
If you're a team of 2-50 developers who need feature flags, gradual rollouts, and basic experimentation without spending $25K+/year, Rollgate is built for you. If you need enterprise approval workflows, a 25-SDK ecosystem, or advanced statistical experimentation, LaunchDarkly or GrowthBook might be better fits today.
We're shipping improvements every week. This audit will be updated quarterly — follow our changelog or join the Discord to stay in the loop.
Try Rollgate free — 500K requests/month, all 13 SDKs, no credit card required.