So many new features, so little time

Building an intent-aware recommendation engine across 200+ AWS services

Amazon

2025 - present

One of the projects I'm most excited about inside the EC2 platform team is something we're calling the Unified Feature Intelligence Layer, or UFIL.

It started with a familiar problem.

AWS customers are drowning in notifications. Flashbars, banners, "What's New" tags, custom alert systems built independently by hundreds of service teams.

Despite all the notifications, customers still can't find features that would genuinely help them. We kept hearing the same thing in our research: "Hmm, I didn't know AWS had that." As my high school Spanish teacher used to say, that's no bueno.

If there's one core insight that has shaped UFIL, it's this: we're not surfacing too many new features, we're surfacing the wrong ones, at the wrong time, in the wrong way, with no coherent signal underneath any of it.

Each service team is independently deciding when and how to interrupt users. There's no shared intelligence. The result is banner blindness, and once a user starts ignoring your recommendations, you've essentially lost the channel.

So what does UFIL actually do?

At its foundation, UFIL introduces a standardized metadata schema that every AWS service team publishes for its features. Instead of hardcoding their own notification logic, teams describe each feature declaratively through fields such as target_personas, prerequisites, intent_category, confidence_threshold, suppression_rules, and a cost_fingerprint that captures estimated savings. That metadata schema is the backbone of everything else.
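To make the idea of a declarative contract concrete, here is a minimal sketch of how a registry might check a submission before accepting it. Only the field names come from the schema; the validateFeatureMetadata function, the type rules, and the messages are hypothetical.

```javascript
// Hypothetical validator: only the field names come from the UFIL schema.
// The function, the type rules, and the messages are illustrative assumptions.
const isPlainObject = (v) => v !== null && typeof v === "object" && !Array.isArray(v);
const isNonEmptyArray = (v) => Array.isArray(v) && v.length > 0;

const REQUIRED_FIELDS = {
  target_personas: isNonEmptyArray,
  prerequisites: isPlainObject,
  intent_category: isNonEmptyArray,
  confidence_threshold: (v) => typeof v === "number" && v >= 0 && v <= 1,
  suppression_rules: isPlainObject,
  cost_fingerprint: isPlainObject,
};

// Returns a list of problems; an empty list means the metadata is acceptable.
function validateFeatureMetadata(metadata) {
  const problems = [];
  for (const [field, isValid] of Object.entries(REQUIRED_FIELDS)) {
    if (!(field in metadata)) {
      problems.push(`missing field: ${field}`);
    } else if (!isValid(metadata[field])) {
      problems.push(`invalid value for: ${field}`);
    }
  }
  return problems;
}
```

Rejecting incomplete metadata at registration time keeps bad data out of the matching stage, which is the point of having teams declare behavior instead of coding it.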

On top of that, UFIL runs a lightweight intent detection engine that reads real-time behavioral signals from the console: what resources a user has open, what actions they've just taken, whether they're in an error state, and whether a cost anomaly has been detected.

From those signals, the system infers a user's intent in the moment: are they troubleshooting? Launching something new? Cost-optimizing? Scaling? The engine classifies that intent and uses it to filter the full catalog of available features down to the one or two that are actually relevant right now.
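As a rough illustration of that classification step, the sketch below uses a handful of ordered rules. The real engine is presumably more sophisticated; the signal names, action prefixes, and rule order here are all assumptions.

```javascript
// Rule-based stand-in for the intent engine. The signal names, action strings,
// and rule order are assumptions made for illustration, not UFIL's real model.
function classifyIntent(signals) {
  const actions = signals.recentActions ?? [];
  if (signals.inErrorState) return "troubleshooting";
  if (signals.costAnomalyDetected) return "cost_optimizing";
  if (actions.some((a) => a.startsWith("launch_"))) return "launching";
  if (actions.some((a) => a.startsWith("scale_"))) return "scaling";
  return "unknown"; // no rule matched, so no intent is claimed
}
```

Putting the error-state rule first reflects one possible priority: a user who is stuck probably cares about the fix more than about anything else on offer.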

Keeping it to "one or two" is intentional and non-negotiable, and at render time UFIL goes further by enforcing a single recommendation slot. Not a feed, not a panel, just one thing. And it only surfaces if the confidence score clears a threshold (≥0.85 in our current spec). Below that line, nothing appears.
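The single-slot rule is simple enough to sketch in a few lines. The candidate shape ({ featureId, score }) and the function name are assumptions; the 0.85 floor is the figure quoted above.

```javascript
// Sketch of the single-slot rule: at most one recommendation, and only if its
// score clears the floor. The candidate shape ({ featureId, score }) is assumed.
const CONFIDENCE_FLOOR = 0.85; // threshold quoted from the current spec

function pickRecommendation(scoredCandidates, floor = CONFIDENCE_FLOOR) {
  let best = null;
  for (const candidate of scoredCandidates) {
    if (candidate.score < floor) continue; // below the line, nothing appears
    if (best === null || candidate.score > best.score) best = candidate;
  }
  return best; // null means the slot stays empty
}
```

Returning null rather than a weaker fallback is the important design choice: an empty slot is treated as a valid outcome, not a failure.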

UFIL in action

One example of how UFIL can help is a common scenario: a customer launches a new EC2 instance, scrolls down to configure storage, and leaves the volume type set to the default gp2, which is both more expensive and has worse baseline performance than the newer gp3.

Today, AWS shows nothing. No hint, no flag, no suggestion. The customer overpays, often indefinitely, without ever knowing there was a better option sitting right next to it.

With UFIL, the EC2 console fires a context event the moment the user selects gp2 in the Configure Storage panel:

aws.discovery.registerContext({
  resourceType: "ec2.instance.launch",
  resourceCount: 1,
  intent: "configure_storage",
  action: "select_volume_type",
  selections: {
    volumeType: "gp2",
    volumeSizeGiB: 100,
    instanceFamily: "m6i",
    region: "us-west-2"
  }
});

That event tells UFIL something important: this user isn't browsing. They're actively configuring storage and have explicitly selected gp2. That's a meaningful behavioral signal, and it's the kind of precision that separates UFIL from a generic notification system.

Meanwhile, the EC2 service team has already published standardized metadata for the gp3 upgrade recommendation:

aws.discovery.registerFeature({
  feature_id: "ec2.ebs.gp3_upgrade_suggestion",
  service: "ec2",
  version: "2025-01-15",

  target_personas: [
    "cost_optimized_user",
    "new_ec2_user",
    "ops_engineer",
    "general_builder"
  ],

  prerequisites: {
    requires_volume_type: ["gp2"],
    max_volume_size_gib: 16000,
    forbidden_volume_types: ["io1", "io2", "sc1", "st1"],
    required_permissions: ["ec2:ModifyVolume"]
  },

  recommended_timing: [
    "when_user_selects_gp2",
    "when_user_expands_configure_storage",
    "before_launch_review"
  ],

  intent_category: ["cost_optimization", "performance_equivalent_choice"],

  confidence_threshold: 0.85, // only show when very confident

  suppression_rules: {
    permanent_if_dismissed: true,
    snooze_period_days: 30,
    hide_if_customer_prefers: ["gp2_for_snapshot_seeding"]
  },

  cost_fingerprint: {
    gp2_price_per_gb: 0.10,
    gp3_price_per_gb: 0.08,
    estimated_monthly_savings_range: "$2–$6"
  },

  retrigger_conditions: [
    "user_reselects_gp2_after_snooze",
    "new_volume_added_with_gp2",
    "instance_template_defaults_to_gp2"
  ]
});
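The cost_fingerprint numbers can be sanity-checked with simple arithmetic. Assuming the savings estimate is driven purely by the per-GB price difference, the "$2–$6" range corresponds to volumes of roughly 100 to 300 GiB. The helper name below is hypothetical.

```javascript
// Worked arithmetic for the cost_fingerprint. Prices are the per-GB monthly
// figures from the metadata above; the helper name is hypothetical.
function estimateMonthlySavings(volumeSizeGiB, gp2PricePerGb, gp3PricePerGb) {
  const raw = volumeSizeGiB * (gp2PricePerGb - gp3PricePerGb);
  return Math.round(raw * 100) / 100; // round to cents to hide float noise
}

console.log(estimateMonthlySavings(100, 0.10, 0.08)); // 2
console.log(estimateMonthlySavings(300, 0.10, 0.08)); // 6
```

For the 100 GiB volume in the context event, that works out to the low end of the published range.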

What I love about this schema

This schema encodes a ton of decision-making. The service team isn't just describing a feature; they're describing when it's appropriate to interrupt someone about it. The confidence_threshold, suppression_rules, and retrigger_conditions fields mean the team has thought carefully about the user experience before the recommendation ever appears.

UFIL then evaluates the incoming context against all registered metadata, runs a match function across 28 candidate features for this workflow, and computes a confidence score of 0.91 for the gp3 suggestion, which clears the threshold. Twenty-seven other features are filtered out due to intent mismatch, unmet prerequisites, or prior user dismissals. The one that remains gets rendered as a single inline banner directly below the gp2 volume row, where the decision is actually happening.
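The filtering stage might look something like the sketch below. Field names follow the metadata example; the context shape (an already-inferred intent category plus the selected volume type), the dismissal set, and the reason strings are assumptions, and scoring is left out entirely.

```javascript
// Sketch of the filtering stage. Field names follow the metadata example;
// the context shape, dismissal set, and reason strings are assumptions.
function rejectionReason(feature, context, dismissedIds) {
  if (dismissedIds.has(feature.feature_id)) return "prior_dismissal";
  if (!feature.intent_category.includes(context.intent)) return "intent_mismatch";
  const required = feature.prerequisites.requires_volume_type ?? [];
  if (required.length > 0 && !required.includes(context.volumeType)) {
    return "unmet_prerequisite";
  }
  return null; // survives filtering and moves on to scoring
}

function filterCandidates(features, context, dismissedIds) {
  return features.filter((f) => rejectionReason(f, context, dismissedIds) === null);
}
```

Keeping the rejection reason as a named value, rather than a bare boolean, would also make it easy to report why a given feature never surfaced.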

Once UFIL decides to surface something, it hands off a render instruction to the console:

aws.discovery.render({
  feature_id: "ec2.ebs.gp3_upgrade_suggestion",
  placement: "configure_storage.inline",
  style: "cost_savings_banner_teal",
  payload: {
    title: "Switch to gp3 and lower your storage costs",
    description: "Same performance as gp2 with lower pricing.",
    estimated_savings: "$2–$6 per month",
    primary_action: "Switch to gp3",
    secondary_expand: "View more insights"
  }
});

A banner then appears not at the top of the page, not in a sidebar, but right inside the container where the decision is happening. This banner shows estimated monthly savings, offers one primary action, and also gives the user three options: snooze it, see details, or permanently dismiss it.

That permanent dismissal is important. When a user dismisses something in UFIL, the system learns from it. It never shows that recommendation again for that resource type unless the user explicitly re-enables it. The suppression data feeds back into the relevance model. Over time, the platform gets quieter for users who want quiet, and more informative for users who want to explore.
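Here is a minimal sketch of how snooze and permanent dismissal could be tracked per feature and resource type. The class, its method names, and the in-memory Map are illustrative assumptions; feeding this data back into the relevance model is not shown.

```javascript
// Sketch of dismissal and snooze handling keyed by feature and resource type.
// The class, method names, and in-memory Map are illustrative assumptions.
const DAY_MS = 24 * 60 * 60 * 1000;

class SuppressionStore {
  constructor() {
    this.entries = new Map(); // key -> expiry timestamp (Infinity = dismissed)
  }
  key(featureId, resourceType) {
    return `${featureId}::${resourceType}`;
  }
  dismiss(featureId, resourceType) {
    this.entries.set(this.key(featureId, resourceType), Infinity);
  }
  snooze(featureId, resourceType, days, now = Date.now()) {
    const k = this.key(featureId, resourceType);
    if (this.entries.get(k) === Infinity) return; // a permanent dismissal wins
    this.entries.set(k, now + days * DAY_MS);
  }
  reenable(featureId, resourceType) {
    this.entries.delete(this.key(featureId, resourceType));
  }
  isSuppressed(featureId, resourceType, now = Date.now()) {
    const until = this.entries.get(this.key(featureId, resourceType));
    return until !== undefined && now < until;
  }
}
```

Modeling a permanent dismissal as an expiry of Infinity lets snoozes and dismissals share one lookup path, and only an explicit reenable call can clear it.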

Cross-service & customer control

One of the more technically interesting parts of UFIL is the cross-service pattern detection. AWS features don't exist in isolation; they work in combinations. EC2 + Auto Scaling Groups + Application Load Balancer is a pattern. VPC + Flow Logs + CloudWatch is another pattern.

Presently, no single service team owns those patterns, so customers assembling architectures across multiple services only see notifications scoped to individual services, with nothing that speaks to the combination.

UFIL sits above all of that. The intent engine recognizes multi-service patterns and can surface features that span service boundaries, something that would be impossible if each team continued building in isolation.
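At its simplest, pattern recognition can be framed as a subset check over the services a customer is using. The two patterns below come from the examples above; the pattern ids, the service-name strings, and the function itself are assumptions.

```javascript
// Sketch of multi-service pattern matching as a subset check. The two patterns
// come from the text; the ids and the service-name strings are assumptions.
const PATTERNS = [
  { id: "scalable_web_tier", services: ["ec2", "autoscaling", "elasticloadbalancing"] },
  { id: "network_observability", services: ["vpc", "vpc_flow_logs", "cloudwatch"] },
];

function detectPatterns(servicesInUse, patterns = PATTERNS) {
  const inUse = new Set(servicesInUse);
  return patterns
    .filter((p) => p.services.every((s) => inUse.has(s)))
    .map((p) => p.id);
}
```

A production version would likely need partial matches too, since a customer with two of the three services is often the one who would benefit most from hearing about the third.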

We're also currently exploring three discovery modes to give customers real control over their signal-to-noise ratio, rather than forcing everyone through the same experience:

balanced: the default, which combines contextual suggestions with periodic updates

focused: only high-confidence, intent-matched recommendations

exploratory: the most open-ended, for customers who actively want to browse what's new across AWS
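One way to picture the three modes is as a small policy table that says which kinds of content each mode lets through. The mode names come from the list above; the content-kind labels and the isAllowed helper are assumptions, since the modes are still being explored.

```javascript
// Sketch of per-mode gating. Mode names come from the text; the content-kind
// labels and the helper are placeholder assumptions, not UFIL settings.
const DISCOVERY_MODES = new Map([
  ["balanced", ["contextual", "periodic_update"]],
  ["focused", ["contextual"]],
  ["exploratory", ["contextual", "periodic_update", "catalog_browse"]],
]);

function isAllowed(mode, contentKind) {
  // Unknown modes fall back to balanced, which the text names as the default.
  const allowed = DISCOVERY_MODES.get(mode) ?? DISCOVERY_MODES.get("balanced");
  return allowed.includes(contentKind);
}
```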

Conclusion

UFIL is the sort of project one might write off as simply "designing prettier notifications." In reality, the bulk of the design work sits under the hood: aligning the metadata, confidence scoring, and intent categories with actual user behaviors.

