When the Structure Becomes the Culture

Why micro teams and rotation reshape culture, not just throughput, in modern SRE. Most SRE leaders design teams around the systems they own. We designed ours around movement. We introduced micro teams expecting a throughput story: smaller groups, tighter scope, faster work. Some of that arrived. What we had not budgeted for was how much […]

The Flow Illusion: Why Transformation Feels Like Theatre

You’ve invested in the tools. Your teams have dashboards that track cycle time, throughput, and work in progress. You’ve likely even built a sophisticated, probabilistic roadmap. Yet, despite the data, it feels like theatre. The moment the workshop ends, that roadmap becomes a static slide deck, and teams remain paralyzed, waiting for you to update […]

Flowtopia Live 2026: Engineering the Frictionfree Enterprise

If your goals are accelerating flow and maximizing value in your organization, consider grabbing yourself a ticket to this year’s Flowtopia Live on Wednesday, June 24th. Flowtopia is a community of value stream practitioners, and this is our annual online jamboree where we gather to share, learn, and celebrate all things flow-related. Over 12 hours, […]

AI Is Accelerating DevOps, Poor Integrations Are Slowing It Down

As AI speeds up software delivery, the real bottleneck isn’t scanning or CI. It’s how safely and predictably change moves across tools, teams, and companies. Something strange is happening in DevOps right now. AI copilots are writing code, generating tests, triaging incidents, and even summarizing pull requests before a human looks at them. The tooling […]

Agentic DevSecOps: AI Security Co-Pilots for Your CI/CD Pipeline

The emergence of AI has brought endless possibilities and innovative opportunities in today’s ever-changing, fast-paced technology landscape. AI is helping development teams produce software significantly faster than ever before. AI-enabled DevSecOps tools can automatically scan code, infrastructure and other configurations for security issues throughout development, accelerating the overall…

Risk-Based Review for Infrastructure as Code Pull Requests

Not every infrastructure pull request deserves the same review path. A tag change in a development account and a network-policy change in production should not create identical reviewer load. When every change is treated as high risk, reviewers stop trusting the signal. In IaC review, I have seen reviewers spend too much attention on low-risk changes […]

The Death of the Four Golden Signals: Designing Telemetry for Non-Deterministic Infrastructure

In complex software systems, our traditional definition of operational health has always been comfortably binary. For over a decade, site reliability engineering (SRE) teams have relied on the industry-standard ‘Four Golden Signals’ — latency, traffic, errors and saturation — as the ultimate truth of platform stability. If our API-response times are hovering at sub-100 ms, […]

The Silent Risk of AI-Written DevOps Pipelines

These days, when a developer needs a CI/CD pipeline, they don’t always dive into GitHub Actions docs or spin up Jenkins from scratch. Instead, they pull up an AI assistant and type out something like: “Create a deployment pipeline for a containerized application.” Seconds later, the AI spits out a complete workflow. It looks polished. […]

Why DIY Test Automation Succeeds Its Way Into a Problem

Ask any engineering team if they can build their own test automation framework, and the answer is almost always “yes.” With modern AI tools involved, that answer arrives faster and with more confidence than ever before. In 30 days, a capable team can spin up scripts, automate flows, generate test cases, and show a demo […]

Regression Testing Tools in the Age of AI-Assisted Development: What Has Changed

For most of the past decade, the conversation around regression testing tools was fairly stable. The tools got faster, the integrations got smoother, and the underlying approach stayed largely the same: write tests, run them in CI, fix failures. The fundamental model did not change much because the problem did not change much. AI-assisted development […]

Overcoming IP Churn in Ephemeral DevOps Environments Using Userspace Overlays

Modern DevOps practices have completely transformed how we handle compute and orchestration. Tools like Kubernetes enable engineering teams to spin up ephemeral containers in seconds and scale workloads dynamically to meet global demand. Yet the underlying network infrastructure has remained stubbornly rigid. Traditional cloud networking relies heavily on static IP addresses, rigid firewall…

Why Enterprise AI Infrastructure Is Becoming a DevOps Problem

Most enterprise AI projects start with retrieval. You connect Jira, Confluence, SharePoint, and Slack. Maybe a few internal databases nobody has touched in five years. You tune embeddings, optimize chunking, wire up a vector database, and convince yourself you’ve built an AI-powered knowledge system. Then the model server crashes. And suddenly, you discover the uncomfortable […]

Why Your AI Agent is a Black Box and How to Fix It With OpenTelemetry

You built the agent. It works in testing. Then it hits production and starts giving wrong answers, timing out or burning through your token budget, and you have no idea why. This is when developers discover that print statements and log files weren’t designed for this. LLM applications fail in ways that traditional tooling can’t see. A hallucination doesn’t throw […]

Why AI Won’t Solve the Hardest Part of Integrations

AI is making it easier for SaaS companies to build integrations. Give a coding agent decent API docs, some context about the systems involved, and a clear prompt, and it can get surprisingly far. It can write the logic faster than most teams could a year or two ago, saving time, reducing repetitive work, and […]

The End of Alert Fatigue: How AI-Powered Observability is Transforming SRE Teams in 2026

Alert fatigue among Site Reliability Engineering (SRE) teams has reached a breaking point, with responders drowning in thousands of weekly notifications where only 3% genuinely warrant attention. This massive volume of noise—driven by fragmented monitoring tools and rigid, threshold-based alerting—stifles innovation, spikes on-call burnout, and compromises system reliability. Fortunately,…

5 Ways Agentic AI is Redefining DevOps Architecture for Self-Healing CI/CD Systems

The era of the flaky test as a simple annoyance is over. As enterprises shift from deterministic applications to agentic AI, flakiness has evolved into a structural bottleneck for traditional CI/CD pipelines reliant on rigid, binary assertions. Because AI agents produce "Y-like" rather than exact results, DevOps architecture must fundamentally change. This article explores the transition from…

On-Call: The Silent Force Shaping Engineering Culture

There is a silent force shaping engineering culture inside every technology organization. It affects productivity, team morale, psychological safety, and long-term retention. And yet, it is rarely discussed in executive meetings or reflected in meaningful KPIs. That force is on-call. On-call is one of the most direct touchpoints engineers have with the reality of the […]

Why DORA Metrics Look Different When AI Is Part of Your Development Workflow

DORA metrics have been a reliable compass for engineering teams for over a decade. Deployment frequency, lead time for changes, change failure rate, mean time to recovery, and reliability give teams a shared language for talking about delivery performance. The research behind them is solid, the benchmarks are well-established, and most engineering leaders know what […]

AI Agents in CI/CD Pipelines: Speed vs Control in Modern DevOps

The moment you push your code, deployment fires off on its own. The pipeline kicks in, the tests sail through, and within a few minutes your app is live in production. There is no manual sign-off and no one scanning through the final changes. Everything is running on the decisions of an AI agent plugged […]
