Right time, right place

Building an intent-aware recommendation engine across 200+ AWS services

Amazon

2025-present

tldr

Product

AWS console features

Role

Research and design lead

Problem

AWS customers miss useful features because hundreds of service teams surface notifications independently, with no shared intelligence underneath.

Outcome

I proposed and am now designing UFIL, an intent-aware recommendation layer designed to work across 200+ AWS services, using shared metadata, real-time behavioral signals, and a single high-confidence recommendation slot to reduce noise and improve relevance.

Overview

The push for a quiet console

I'm currently leading research and design for a project we're calling the Unified Feature Intelligence Layer (UFIL). UFIL centers on a standardized metadata schema that every AWS service team will use when launching new features.

My work focuses on creating a shared framework that treats the entire console as a single, coherent experience rather than a collection of independent services.

Problem

Too much noise

UFIL started with a problem that's all too familiar for most enterprise platforms: notification deafness.

AWS customers are drowning in notifications. Flashbars, banners, "What's New" tags, custom alert systems built independently by hundreds of service teams.

Despite all the notifications, customers still can't find features that would genuinely help them. We kept hearing the same thing in our research: "Hmm, I didn't know AWS had that."

As my high school Spanish teacher used to say, that's no bueno.

Discovery

Timing is everything

Multiple org-wide internal research studies pointed to the same finding: the vast majority of AWS customers were unaware of features that could have saved them significant time and money.

The core insight isn't that we surface too many features. It's that we surface the wrong ones, at the wrong time, in the wrong place.

Once I framed the problem that way, the design challenge changed from notification design to recommendation quality.

That then led me to focus on the decision layer underneath the UI. What signals actually matter in the moment? What conditions should be true before we interrupt someone? How can service teams describe their features in a consistent way without hardcoding custom logic every time?

Solution

Declarative design and intent

UFIL has two core pieces. First, a standardized metadata schema that every service team publishes for their features. Instead of hardcoding notification logic, teams describe their feature declaratively: who it's for, what prerequisites are needed, when it's appropriate to surface, and when to stay quiet.

Second, a lightweight intent detection engine that reads real-time behavioral signals from the console. What resources a user has open, what actions they've just taken, whether they're in an error state. The engine classifies the user's intent in the moment and filters the full catalog of features down to one recommendation. Not a feed. Not a panel. One thing. And only if the confidence score clears a threshold. Below that line, nothing appears.
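To make the single-slot rule concrete, here's a minimal sketch of how that filtering could work. Everything in it is hypothetical and for illustration only: the function name `selectRecommendation`, the `intents` and `confidenceThreshold` fields, and the toy scorer are my assumptions, not UFIL's actual API or scoring model.

```javascript
// Hypothetical sketch: names and data shapes are illustrative, not UFIL's API.

// Pick at most one feature for the current context, or null to stay quiet.
function selectRecommendation(context, candidates, scoreFn) {
  let best = null;
  for (const feature of candidates) {
    // Drop features whose declared intents don't include the detected intent.
    if (!feature.intents.includes(context.intent)) continue;

    const score = scoreFn(context, feature);
    // A feature must meet its own threshold to be eligible at all.
    if (score < feature.confidenceThreshold) continue;

    // Keep only the highest-scoring eligible feature.
    if (best === null || score > best.score) {
      best = { featureId: feature.featureId, score };
    }
  }
  return best; // null means nothing is rendered
}

// Toy scorer for demonstration only: a fixed score per feature.
const toyScores = { "feature.a": 0.9, "feature.b": 0.6, "feature.c": 0.95 };
const scoreFn = (_context, feature) => toyScores[feature.featureId];

const candidates = [
  { featureId: "feature.a", intents: ["configure_storage"], confidenceThreshold: 0.8 },
  { featureId: "feature.b", intents: ["configure_storage"], confidenceThreshold: 0.8 },
  { featureId: "feature.c", intents: ["troubleshoot_error"], confidenceThreshold: 0.8 },
];

const picked = selectRecommendation({ intent: "configure_storage" }, candidates, scoreFn);
const quiet = selectRecommendation({ intent: "browse" }, candidates, scoreFn);
console.log(picked, quiet);
```

In this toy run, `feature.c` scores highest but is dropped for intent mismatch, `feature.b` falls below its threshold, and only `feature.a` survives. When no candidate matches the intent, the function returns `null` and the console shows nothing.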

UFIL in action

Here's a common scenario: a customer launches a new EC2 instance, scrolls down to configure storage, and leaves the default volume type set to gp2. That volume type is both more expensive and lower performing than the newer gp3. Today, AWS shows nothing. No hint, no flag, no suggestion.

With UFIL, the EC2 console fires a context event the moment the user selects gp2:

aws.discovery.registerContext({
  resourceType: "ec2.instance.launch",
  resourceCount: 1,
  intent: "configure_storage",
  action: "select_volume_type",
  selections: {
    volumeType: "gp2",
    volumeSizeGiB: 100,
    instanceFamily: "m6i",
    region: "us-west-2"
  }
});

That event tells UFIL something important: this user isn't browsing. They're actively configuring storage and have explicitly selected gp2.

In the background, UFIL analyzes the standardized metadata the service team has already published for the gp3 upgrade recommendation:

aws.discovery.registerFeature({
  feature_id: "ec2.ebs.gp3_upgrade_suggestion",
  service: "ec2",
  version: "2025-01-15",

  target_personas: [
    "cost_optimized_user",
    "new_ec2_user",
    "ops_engineer",
    "general_builder"
  ],

  prerequisites: {
    requires_volume_type: ["gp2"],
    max_volume_size_gib: 16000,
    forbidden_volume_types: ["io1", "io2", "sc1", "st1"],
    required_permissions: ["ec2:ModifyVolume"]
  },

  recommended_timing: [
    "when_user_selects_gp2",
    "when_user_expands_configure_storage",
    "before_launch_review"
  ],

  intent_category: ["cost_optimization", "performance_equivalent_choice"],

  confidence_threshold: 0.82, // only show when very confident

  suppression_rules: {
    permanent_if_dismissed: true,
    snooze_period_days: 30,
    hide_if_customer_prefers: ["gp2_for_snapshot_seeding"]
  },

  cost_fingerprint: {
    gp2_price_per_gb: 0.10,
    gp3_price_per_gb: 0.08,
    estimated_monthly_savings_range: "$2–$6"
  },

  retrigger_conditions: [
    "user_reselects_gp2_after_snooze",
    "new_volume_added_with_gp2",
    "instance_template_defaults_to_gp2"
  ]
});

This schema encodes a ton of decision-making. The service team isn't just describing a feature. They're describing when it's appropriate to interrupt someone about it. The confidence threshold, suppression rules, and retrigger conditions mean the team has thought carefully about the user experience before the recommendation ever appears.
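As one example of what that decision-making could look like at runtime, here's a rough sketch of evaluating the `suppression_rules` block. The `permanent_if_dismissed` and `snooze_period_days` fields come from the schema above; the `isSuppressed` function and the per-user `history` record (`dismissed`, `snoozedAt`) are assumptions I've made up for illustration.

```javascript
// Hypothetical sketch: isSuppressed and the history shape are illustrative only.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// history: { dismissed: boolean, snoozedAt: Date | null } (assumed per-user record)
function isSuppressed(rules, history, now) {
  // A permanent dismissal always wins when the rule is enabled.
  if (rules.permanent_if_dismissed && history.dismissed) return true;

  // A snooze hides the feature until the snooze period has fully elapsed.
  if (history.snoozedAt) {
    const elapsedDays = (now.getTime() - history.snoozedAt.getTime()) / MS_PER_DAY;
    if (elapsedDays < rules.snooze_period_days) return true;
  }
  return false;
}

const rules = { permanent_if_dismissed: true, snooze_period_days: 30 };
const now = new Date("2025-03-01T00:00:00Z");

// Snoozed 9 days ago: still inside the 30-day window.
const recentSnooze = isSuppressed(
  rules, { dismissed: false, snoozedAt: new Date("2025-02-20T00:00:00Z") }, now);

// Snoozed 45 days ago: the window has passed, so the feature is eligible again.
const expiredSnooze = isSuppressed(
  rules, { dismissed: false, snoozedAt: new Date("2025-01-15T00:00:00Z") }, now);

// Dismissed outright: suppressed regardless of timing.
const dismissed = isSuppressed(rules, { dismissed: true, snoozedAt: null }, now);

console.log(recentSnooze, expiredSnooze, dismissed);
```

The point of keeping this logic generic is that the service team only declares the rules. A check like this would run the same way for every feature in the catalog.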

UFIL evaluates the incoming context against all registered metadata, runs a match function across 28 candidate features for this workflow, and computes a confidence score of 0.82 for the gp3 suggestion, just enough to meet the feature's declared threshold of 0.82.

Twenty-seven other features are filtered out due to intent mismatch, unmet prerequisites, or prior user dismissals. The one that remains gets rendered as a single inline banner:

aws.discovery.render({
  feature_id: "ec2.ebs.gp3_upgrade_suggestion",
  placement: "configure_storage.inline",
  style: "cost_savings_banner_teal",
  payload: {
    title: "Switch to gp3 and lower your storage costs",
    description: "Same performance as gp2 with lower pricing.",
    estimated_savings: "$2–$6 per month",
    primary_action: "Switch to gp3",
  }
});

A banner then appears not at the top of the page, not in a sidebar, but right inside the container where the decision is happening. This banner shows estimated monthly savings, offers one primary action, and also gives the user three options: snooze it, see details, or permanently dismiss it.
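For a sense of where a savings figure like that could come from, here's a small worked example using the `cost_fingerprint` values and the 100 GiB volume from the context event. It assumes the prices are USD per GB per month and treats GiB and GB as equivalent, neither of which the schema states. The `estimateMonthlySavings` helper is hypothetical.

```javascript
// Hypothetical helper: assumes prices are USD per GB-month.
function estimateMonthlySavings(fingerprint, volumeSizeGiB) {
  const perGbSavings = fingerprint.gp2_price_per_gb - fingerprint.gp3_price_per_gb;
  // Round to whole cents to avoid floating-point noise in the result.
  return Math.round(perGbSavings * volumeSizeGiB * 100) / 100;
}

const fingerprint = { gp2_price_per_gb: 0.10, gp3_price_per_gb: 0.08 };
const savings = estimateMonthlySavings(fingerprint, 100);

console.log(`$${savings.toFixed(2)} per month`); // prints "$2.00 per month"
```

Under those assumptions, a 100 GiB volume saves about $2 a month, which sits at the low end of the $2 to $6 range the banner displays.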

Impact

A quieter console

UFIL is still in progress. But the framework is already changing how we think about the console. Before, no team could recommend features that spanned multiple services. Now they can.

The shared metadata gives every team a common language for when to surface something. And the single-slot constraint means the console gets quieter over time, not louder.

Reflection

The design beneath the surface

Working on this project is a reminder that sometimes the most important design decisions happen below the surface.

Most of the design work on this project has nothing to do with what the customer sees. The banner is simple. The hard part is the metadata schema underneath it, and getting hundreds of service teams to describe their features the same way.

That's a systems thinking problem. When should a recommendation appear? When should it stay quiet? When should it never come back? Those decisions get made in the schema, not the UI.
