Why Your AWS Bill Is Still High - Even After Checking the Obvious

It’s a familiar moment for many AWS users:
You open the bill expecting EC2, RDS, EKS, or Bedrock to be the culprits…
…but the numbers don’t add up.

Even after accounting for your major services, your AWS costs remain surprisingly high.

This is more common than you think.

Modern AWS environments hide dozens of small, silent cost drivers that add up month over month. The problem isn’t just compute or databases - it’s the surrounding services, configuration drift, and “always-on” defaults that quietly inflate spend.

Below, we break down the most frequent hidden or overlooked AWS cost traps we see in real customer environments, and how you can identify (and eliminate) them.


1. The Rise of “Shadow Storage”

Most teams look at S3 storage totals - but overlook the hidden multipliers:

a. Old S3 multipart uploads

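Abandoned multipart uploads can accumulate hundreds of GB over time. The uploaded parts are billed like regular storage, yet they never show up in a normal object listing.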

b. S3 versioning gone wild

Versioning keeps every previous copy of an object until a lifecycle rule expires it. A 1 GB log file overwritten daily can quietly become a 30 GB monthly surprise.
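To make the arithmetic concrete, here is a minimal sketch of how versions of a single overwritten key pile up, followed by a lifecycle rule that caps both noncurrent versions and abandoned multipart uploads. The function name, the rule ID, and the 7-day and 30-day windows are illustrative assumptions. The rule dictionary follows the shape that boto3's put_bucket_lifecycle_configuration accepts, but review it against your own buckets before applying anything.

```python
def versioned_storage_gb(object_gb, overwrites_per_day, days, keep_noncurrent_days=None):
    """Estimate GB stored for one key that is fully overwritten on a schedule.

    With versioning on, each overwrite leaves the previous copy behind as a
    noncurrent version. keep_noncurrent_days models a lifecycle rule that
    expires noncurrent versions after that many days (None = keep forever).
    """
    total_versions = overwrites_per_day * days
    if keep_noncurrent_days is not None:
        # One current version plus whatever was written inside the window.
        window_versions = 1 + overwrites_per_day * keep_noncurrent_days
        total_versions = min(total_versions, window_versions)
    return object_gb * total_versions


# Hypothetical rule: abort stale multipart uploads after 7 days and
# expire noncurrent versions after 30 days, for the whole bucket.
LIFECYCLE_RULE = {
    "Rules": [
        {
            "ID": "cap-shadow-storage",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        }
    ]
}

print(versioned_storage_gb(1, 1, 30))                          # no lifecycle rule
print(versioned_storage_gb(1, 1, 30, keep_noncurrent_days=7))  # with a 7-day cap
```

The estimate is deliberately rough: it ignores partial days and assumes every overwrite is a full-size copy.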

c. Cross-region replication

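Every replica is billed as full storage in its destination region, so two buckets double the storage cost and three triple it.
Replication also adds inter-region data transfer and request charges on top.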

d. S3 Glacier retrieval spikes

Backup software or misconfigured policies can trigger expensive retrievals you never intended.


2. Data Transfer - The Silent Budget Killer

Data transfer is the #1 “mystery cost” for most AWS users.

Common offenders:

a. Inter-AZ traffic (not free!)

Many architectures replicate traffic across AZs unexpectedly - especially:

  • load balancer → EC2
  • EC2 → RDS
  • ECS tasks → other tasks
  • EKS cross-AZ pod communication

We’ve seen inter-AZ fees exceed EC2 instance cost in some workloads.
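A quick back-of-the-envelope check shows how this happens. The sketch below assumes the commonly published rate of $0.01 per GB charged in each direction for cross-AZ traffic; the function name and the 50 TB figure are made up for illustration, so substitute the rate and volume for your own region and workload.

```python
def inter_az_monthly_cost(gb_per_month, rate_per_gb_each_way=0.01):
    """Cross-AZ traffic is billed on both the sending and the receiving side,
    so the effective rate is twice the per-direction rate."""
    return gb_per_month * rate_per_gb_each_way * 2


# Hypothetical workload: 50 TB per month of chatty service-to-service traffic.
print(f"${inter_az_monthly_cost(50_000):,.2f} per month")
```

At that volume the transfer charge alone is comparable to a small fleet of instances, which is why it is worth checking whether callers and callees can be kept in the same AZ.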

b. NAT Gateway tax

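At $0.045 per GB processed (us-east-1 pricing, on top of the hourly charge), heavy traffic through a NAT Gateway becomes a silent cost avalanche. The charge applies to every byte that passes through the gateway, including traffic to AWS services such as S3.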
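As a rough sketch, the monthly NAT Gateway bill can be split into a fixed part and a per-GB part. The rates below ($0.045 per hour and $0.045 per GB) are assumed us-east-1 list prices, 730 is used as an average month in hours, and the function name is hypothetical. Internet egress charges are billed separately and are not included here.

```python
HOURS_PER_MONTH = 730  # average month, used for rough estimates


def nat_gateway_monthly_cost(gb_processed, gateways=1,
                             hourly_rate=0.045, per_gb_rate=0.045):
    """Fixed hourly charge per gateway plus a per-GB data processing charge."""
    fixed = gateways * hourly_rate * HOURS_PER_MONTH
    variable = gb_processed * per_gb_rate
    return fixed + variable


# Hypothetical: one gateway handling 10 TB per month.
print(f"${nat_gateway_monthly_cost(10_000):,.2f}")
```

In this example the per-GB part dwarfs the hourly part, which is the usual pattern once container image pulls, S3 traffic, or log shipping start flowing through the gateway.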

c. CloudFront → origin fetches

Poor cache TTL settings or dynamic content can cause thousands of unnecessary origin pulls.

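d. VPC endpoints (charged per hour + per GB)

Some users believe they save money with endpoints - yet interface endpoints bill an hourly fee per AZ plus a per-GB processing charge, so low-volume patterns can end up costing more than simply using NAT. Gateway endpoints for S3 and DynamoDB, by contrast, carry no charge.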
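One way to reason about this is a break-even volume. The sketch below assumes an interface endpoint billed at $0.01 per hour per AZ plus $0.01 per GB, compared against a NAT Gateway data processing rate of $0.045 per GB; all three rates and the function name are assumptions to replace with your region's pricing. It also assumes the NAT Gateway stays in place either way, so its hourly charge is left out of the comparison.

```python
def endpoint_break_even_gb(az_count, endpoint_hourly=0.01, endpoint_per_gb=0.01,
                           nat_per_gb=0.045, hours=730):
    """GB per month above which an interface endpoint is cheaper than sending
    the same traffic through an existing NAT Gateway."""
    fixed = az_count * endpoint_hourly * hours      # endpoint cost with zero traffic
    saving_per_gb = nat_per_gb - endpoint_per_gb    # what each GB saves vs. NAT
    return fixed / saving_per_gb


# Hypothetical: one endpoint deployed across three AZs.
print(f"{endpoint_break_even_gb(3):.0f} GB per month")
```

Below that volume, each additional endpoint is a net cost. Multiply by the number of services you create endpoints for and the fixed charges add up quickly.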


3. Logs, Metrics, and “Invisible” Observability Costs

After compute, the biggest hidden cost category is almost always observability.

a. CloudWatch Logs retention

By default, CloudWatch log groups never expire - leading to multi-TB log groups that bill for storage indefinitely.
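A simple audit is to list the log groups that have no retention policy and sort them by size. The sketch below works on plain dictionaries shaped like the entries returned by the CloudWatch Logs describe_log_groups call (where retentionInDays is absent when a group never expires); the function name and the sample data are made up for illustration.

```python
def groups_without_retention(log_groups):
    """Return (name, stored_gb) for log groups with no retention policy,
    largest first."""
    unbounded = [
        (group["logGroupName"], group.get("storedBytes", 0) / 1024**3)
        for group in log_groups
        if "retentionInDays" not in group
    ]
    return sorted(unbounded, key=lambda item: item[1], reverse=True)


# Hypothetical sample, in the shape of describe_log_groups entries.
sample = [
    {"logGroupName": "/aws/lambda/checkout", "storedBytes": 5 * 1024**3},
    {"logGroupName": "/ecs/api", "storedBytes": 2 * 1024**3, "retentionInDays": 30},
    {"logGroupName": "/aws/eks/prod", "storedBytes": 40 * 1024**3},
]
for name, gb in groups_without_retention(sample):
    print(f"{name}: {gb:.1f} GB, never expires")
```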

b. CloudWatch custom metrics

Each custom metric costs money every single month.
Each unique combination of dimension values counts as a separate metric, so high-cardinality dimensions explode pricing.
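The multiplication is easy to underestimate. The sketch below computes a worst-case count of billable metrics, assuming every combination of dimension values actually occurs and a first-tier price of $0.30 per metric per month; the dimension names, counts, and function name are hypothetical.

```python
from math import prod


def custom_metric_monthly_cost(distinct_values_per_dimension, metric_names=1,
                               price_per_metric=0.30):
    """Worst case: every combination of dimension values is a billable metric."""
    series = metric_names * prod(distinct_values_per_dimension.values())
    return series, series * price_per_metric


# Hypothetical: one latency metric tagged by service, endpoint, and status code.
series, cost = custom_metric_monthly_cost(
    {"service": 20, "endpoint": 50, "status_code": 5}
)
print(f"{series} metrics, about ${cost:,.2f} per month")
```

Dropping or bucketing a single dimension (for example, status class instead of status code) shrinks the product directly.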

c. CloudWatch Logs “payload tax”

JSON logs from chatty microservices generate enormous ingestion volumes.

d. X-Ray sampling or tracing defaults

A single high-traffic service can generate huge amounts of trace data.

e. Third-party log shippers

If you push logs to multiple tools (Datadog, Splunk, Elastic), you’re often paying 2–3 times for the same data.


4. Old Snapshots, AMIs, and Other Orphaned Resources

AWS never deletes things for you unless you explicitly tell it to.

Common forgotten artifacts include:

  • EBS snapshots (especially automated ones)
  • AMIs created during deployments
  • Lambda layer versions
  • Old ECR images
  • Elastic IPs not attached to anything
  • DynamoDB tables used for past test environments
  • Redshift clusters paused but not deleted

Snapshots, in particular, often accumulate years of incremental backups.
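A first pass at cleanup is simply finding snapshots older than some cutoff. The sketch below filters a list of dictionaries shaped like EC2 describe_snapshots entries (SnapshotId plus a timezone-aware StartTime); the 90-day threshold, the function name, and the snapshot IDs are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def stale_snapshots(snapshots, max_age_days=90, now=None):
    """Return IDs of snapshots started more than max_age_days ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]


# Hypothetical sample, in the shape of describe_snapshots entries.
sample = [
    {"SnapshotId": "snap-old-example",
     "StartTime": datetime(2023, 6, 1, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-new-example",
     "StartTime": datetime(2024, 12, 15, tzinfo=timezone.utc)},
]
print(stale_snapshots(sample, now=datetime(2025, 1, 1, tzinfo=timezone.utc)))
```

Treat the result as a review list rather than a delete list: snapshots are incremental, so removing one may free less space than its volume size suggests, and a snapshot that backs a registered AMI cannot be deleted until the AMI is deregistered.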


5. Overprovisioned “Support Services” Nobody Monitors

Even when EC2 and RDS look fine, the supporting services may not.

a. Overprovisioned load balancers

An ALB costs roughly $16/month in base hourly charges in us-east-1 (more in some regions), plus LCU charges.
Many companies have dozens they forgot about.

b. Queues with high throughput settings

SQS and SNS costs scale with request volume - sometimes artificially inflated by retry storms.

c. Step Functions charged by state transition

A poorly designed workflow can cost 10–50× more than expected.
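The multiplier usually comes from loops. The sketch below compares a straight-through workflow with one that polls for a result, assuming Standard workflows priced at $0.025 per 1,000 state transitions and ignoring the free tier; the workflow shapes, volumes, and function name are made up for illustration.

```python
def standard_workflow_cost(transitions_per_execution, executions_per_month,
                           price_per_1000=0.025):
    """Monthly cost of a Standard workflow billed per state transition."""
    total_transitions = transitions_per_execution * executions_per_month
    return total_transitions * price_per_1000 / 1000


EXECUTIONS = 1_000_000

# Hypothetical design A: six states, no loops.
linear = standard_workflow_cost(6, EXECUTIONS)

# Hypothetical design B: the same six states plus a Wait -> Check -> Choice
# polling loop that runs 40 times per execution (3 transitions per pass).
polling = standard_workflow_cost(6 + 3 * 40, EXECUTIONS)

print(f"linear: ${linear:,.2f}  polling: ${polling:,.2f}  ratio: {polling / linear:.0f}x")
```

Replacing the polling loop with a callback or event-driven pattern removes most of those transitions, though whether that fits depends on the service being waited on.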

d. API Gateway

High request volume, poor caching, or VPC integrations can dramatically increase cost.

e. DynamoDB

Provisioned throughput often remains high long after traffic decreases.


6. Orphaned Kubernetes and ECS Infrastructure

Even if compute workloads are accounted for, the ecosystem around them may not be:

a. EKS cluster control plane costs

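Even the smallest EKS cluster incurs a control plane charge ($0.10 per hour, roughly $73/month, at standard-support pricing), regardless of how many nodes it runs.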

b. Unused node groups or Fargate profiles

Clusters often keep scaling groups alive even after workloads change.

c. Persistent volumes never cleaned up

PVCs and EBS volumes survive long after pods are gone.
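Leftover volumes are usually easy to spot because they sit in the "available" state, meaning nothing is attached to them. The sketch below works on dictionaries shaped like EC2 describe_volumes entries and assumes a gp3 price of $0.08 per GB-month; the price, function name, and volume IDs are illustrative assumptions.

```python
def unattached_volume_report(volumes, price_per_gb_month=0.08):
    """Return ([(volume_id, size_gib), ...], estimated_monthly_cost) for
    volumes that are not attached to any instance."""
    orphans = [
        (v["VolumeId"], v["Size"])
        for v in volumes
        if v["State"] == "available"
    ]
    monthly_cost = sum(size for _, size in orphans) * price_per_gb_month
    return orphans, monthly_cost


# Hypothetical sample, in the shape of describe_volumes entries.
sample = [
    {"VolumeId": "vol-pvc-leftover", "Size": 100, "State": "available"},
    {"VolumeId": "vol-in-use", "Size": 50, "State": "in-use"},
]
orphans, cost = unattached_volume_report(sample)
print(orphans, f"about ${cost:.2f} per month")
```

For Kubernetes specifically, it is worth checking the reclaim policy on the storage class as well, since a "Retain" policy keeps the volume after the claim is deleted.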

d. Service mesh proxies

High data volume through mesh sidecars increases compute and bandwidth costs.


7. IAM Misconfiguration Leading to “Runaway Services”

Overly permissive IAM roles, combined with missing concurrency limits or quotas, sometimes allow:

  • Lambda functions to scale without limits
  • Batch jobs to spin up unlimited resources
  • Step Functions to run recursive workflows
  • Glue jobs to fire repeatedly due to failed triggers

All of these quietly run up the bill until someone notices.


8. Legacy Services Nobody Remembers

This is surprisingly common.

Old environments often contain:

  • EMR clusters used once
  • ElastiCache clusters created for tests
  • OpenSearch domains left idle
  • SageMaker notebooks left running
  • Athena/Glue result buckets with no lifecycle rules
  • ML inference endpoints that never turn off

One or two forgotten services won’t break the bank - but dozens will.


9. Configuration Drift

Environments change over time:

  • Someone adds a new ALB
  • A developer bumps an instance type
  • A team removes the retention limit on a log group
  • A Lambda function begins logging 10× more data
  • A NAT Gateway becomes the default path for everything

Most organizations don’t detect drift until costs spike.
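Full drift detection needs a configuration baseline, but a cheap proxy is to compare per-service cost between two periods and flag the jumps before they become a spike. The sketch below is one possible approach, not a complete drift detector: it takes two plain dictionaries of service name to cost (however you export them), and the thresholds, function name, and figures are illustrative assumptions.

```python
def cost_spikes(previous, current, min_increase_pct=25.0, min_delta=10.0):
    """Flag services whose cost rose by at least min_delta dollars and
    min_increase_pct percent. Returns (service, before, after) tuples,
    largest dollar increase first."""
    flagged = []
    for service, after in current.items():
        before = previous.get(service, 0.0)
        delta = after - before
        if delta < min_delta:
            continue  # ignore decreases and small absolute changes
        pct = float("inf") if before == 0 else delta / before * 100
        if pct >= min_increase_pct:
            flagged.append((service, before, after))
    return sorted(flagged, key=lambda row: row[2] - row[1], reverse=True)


# Hypothetical month-over-month figures.
last_month = {"CloudWatch": 120.0, "EC2": 900.0, "Lambda": 4.0}
this_month = {"CloudWatch": 310.0, "EC2": 930.0, "Lambda": 9.0, "OpenSearch": 75.0}
for service, before, after in cost_spikes(last_month, this_month):
    print(f"{service}: ${before:.2f} -> ${after:.2f}")
```

Requiring both a percentage and an absolute threshold keeps tiny services from drowning out the changes that matter.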


10. Tools, Integrations, and Pipelines Working Against You

Third-party tools often generate AWS cost indirectly:

  • CI pipelines constantly pushing artifacts
  • Monitoring agents with high sampling
  • Replicating logs to multiple regions
  • Serverless monitoring platforms adding overhead
  • Backup tools creating redundant snapshots

Individually small - together expensive.


Final Thoughts: The Small Costs Add Up

If you’ve already accounted for EC2, RDS, Bedrock, and other top-level services, the remaining spend is likely hidden across:

  • Storage multipliers
  • Data transfer
  • Logging and observability
  • Orphaned or legacy resources
  • Over-provisioned supporting services
  • Kubernetes ecosystem sprawl
  • Configuration drift
  • Forgotten pipelines and integrations

The key is not just identifying one issue - but systematically reviewing these categories.
