How to enable a serverless first cloud centre of excellence

Dave spoke with Jessica Feng (Serverless Business Development, AWS) at AWS re:Invent 2020 on Enabling a serverless-first Cloud Center of Excellence.

Enterprises are embracing serverless to drive continuous innovation, faster time to market, and developer agility as part of broader modernization efforts. While individual teams are building prototypes, many are enabling cross-organizational support through their Cloud Center of Excellence (CCoE). These CCoEs have been a critical mechanism for scaling a serverless-first strategy through an organization by establishing standardized frameworks, best practices, and common patterns.

This session outlines key patterns for a CCoE and how it accelerates serverless adoption. You also hear from Liberty Mutual about its journey to a serverless-first mindset. Leave this session ready to bring a serverless strategy to your organization through a CCoE.

AWS re:Invent 2020

Introducing Jessica Feng

Hi, thanks for joining us today on this session: Enabling a Serverless First Cloud Center of Excellence. I’m Jessica, and I spend a lot of my time working with customers like FINRA, Coca-Cola and Capital One on their use of serverless. With serverless, they’re able to build applications faster with the ability to scale to millions of users, have global availability, and respond to their customers in milliseconds. They can help foster innovation throughout their development teams and really accelerate that business agility through their use of serverless.

So how do our customers accelerate business agility through serverless? And how did they get there? In this session, I’ll cover the benefits of moving towards a serverless first strategy, and how to think about where you are along that journey. I’ll share common success patterns we’ve seen and heard from our customers, patterns across technology, processes, and people.

Finally, I’ll have Dave Anderson from Liberty Mutual join me. He’ll share how Liberty Mutual got to their serverless first strategy and mindset. Now, there’s a lot of content to cover here. And unfortunately, we won’t be able to go deep on it all. But I do want you to walk away with a mindset and a set of practical guidance to help get you started on serverless today.

Business agility through Serverless

We announced Lambda almost six years ago, and it set off the serverless revolution. We talk a lot about how serverless is the future. What I love about this quote is that the future is here. Hindsight is 2020. So if Lambda and the serverless portfolio can meet the scale, agility, cost, and security needs of our retail business as we know it today, then that’s where other customers should think about starting.

In fact, over half the applications built by our retail organization were Lambda based. A serverless strategy enables you to maximize the value from the cloud. And by serverless, I’m not just talking about services like Lambda, API Gateway, Step Functions or messaging services. I’m talking about applications that are built with technologies that eliminate the need to manage servers, scale nearly infinitely, use a pay-per-use pricing model, and are automatically highly available. And customers are making the decision for a serverless first strategy.

That is a decision to designate serverless as the preference and to prioritize the tenets of serverless across their organization. Customers like iRobot, Fender, T-Mobile and Liberty Mutual are all serverless first.

Key drivers for architecture choice

Now there are a few key drivers that customers need to think about when making their initial architecture choice. First, speed to market. You have to move fast and win customers. Customers are able to be hyper efficient with small teams when using serverless. Customers like Fender built a new set of applications, which acquired 40% more content and more paid users. Next is cost. With serverless, the cost doesn’t just align with usage, it also lowers your total human operational time. So customers like Siemens haven’t had one minute of downtime since moving to serverless.

Next, your developers want autonomy. You want to minimize the reliance on other teams and keep that end to end ownership within the core team so you can move as fast as possible.

Finally, you need scalability, flexibility, and low operational burden in order to achieve that quick time to market. Customers like iRobot treat their busiest Christmas periods like it’s business as usual for their engineers, because Lambda automatically scales to meet their demand.

Organic adoption pattern

There are many paths to serverless. However, we’ve seen this pattern emerge most commonly. We refer to it as the organic adoption pattern, because it’s grassroots and it tends to occur without much outside influence. It starts with developers who discover that Lambda can help automate simple processes. They use it for cron jobs or for connecting services: projects they can start and finish without any additional permissions.

Over time, they start wondering what else this principle can be applied to, and data transformation is the next common workload. There’s often existing knowledge of serverless within the organization.

The tipping point for organic adoption

This is when understanding the economics behind serverless, and its impact at a team level, becomes critical. This is often where we see the tipping point for organic adoption.

Customers think, “Well, why am I spending time focusing on the underlying infrastructure when it really isn’t important for my business?” When teams start to think about using serverless in a more meaningful way, they begin to establish standard organizational practices and processes to better enable their developers.

This is where technology and business strategy begin to blend, and we start seeing involvement from the executives. Leadership takes notice and wonders how they can apply a serverless strategy to their broader organization.

But it’s important to look at where you are today along this journey, and how you prioritize what’s critical for your business and organization. Consider, are you working backwards from the customer? Do you have people wanting to pick up new skills? What’s your API strategy? Is technology being used at the core to drive your business goals? How can you move faster to enable developer autonomy? Are you ready to make changes today? Or make changes tomorrow? And be honest with yourself. Where are the gaps for where you want to go? And the goal here is to help you get started and give guidance for what you can do today.

Cloud Center of Excellence

So what does this mean for a Cloud Center of Excellence? As our customers progress through this adoption lifecycle, and you may have seen this developer organic adoption within your teams, it’s not always as smooth as you’d like. There’s a different mindset and a different operational model that you need to think through. We’ve seen that Cloud Centers of Excellence can play a critical role in enabling and accelerating that serverless first strategy.

So Cloud Centers of Excellence, or CCoEs, are typically multidisciplinary teams. They implement the governance, best practices, enablement, and architecture needed for cloud adoption in a way that provides a repeatable pattern for the larger organization to follow. I’m actually using CCoE rather loosely here. Some organizations have these official centers of excellence. Others have enterprise architects, platform teams, even DevOps teams. The role that I’m describing here today is one that owns and establishes operational practices. They provide guidance on tooling, and they ensure that governance and compliance are adhered to.

These are the teams that want a consistent way to apply org-wide guardrails and have centralized mechanisms, enabling development teams through sample applications or templates. So CCoEs need to focus on the needs of the business and their internal IT customers. Their set of responsibilities is really broad.

Establishing a serverless first mindset is an iterative process

Establishing a serverless first mindset is not instantaneous, or even one single operation. It’s an iterative process. But across each set of these responsibilities, we see common patterns emerge. And so each one of these topics definitely deserves a deep dive. But what we’ll share today is common patterns of success, and break down areas where you can get started.

Regardless of where you are on your serverless first journey, we found very common patterns across our customers. First, organizing people and processes. This is often driven by an executive sponsor to be that bellwether for serverless. Second, establishing operational principles to readily take advantage of new technologies.

This includes practical guardrails that both enable your developers to move as quickly as possible and reduce the occurrence and the blast radius of undesirable behavior.

Finally, encouraging pilots early and often. This means bringing to life that collaboration between operations and development, and that organizational governance and tooling that you set into place.

Let’s get started with the executive sponsor. It’s essential to secure the right level of support in establishing a serverless first strategy. Many customers create executive steering committees: those who can serve as the North Star for the business and, more specifically, can influence corporate mandates and business strategy.

They also tie that back to how, what, and where development teams should deliver. That’s the organizational direction that’s required. Now, some developers might ask, “Well, when do I use serverless versus containers?” But that’s not quite the right question to focus on.

Amazon’s Two Pizza Teams

You should be thinking about what’s most important to you. Where does your business want to be in three to five years? And how can you keep up the pace of innovation as demanded by your customers? Having that bellwether for serverless often leads to people decisions and organizational changes.

You may have heard about Amazon and our two-pizza teams. Teams align behind the API of an app and own that app end to end. These teams are more agile. They have more ownership, and if they write the code, they own it. But what do two-pizza teams actually look like in real life? We found that our most successful teams are formed with a mix of developers, those who are writing and operating those applications, a technical lead to influence and guide the direction of the team, and a program manager, someone who’s responsible for coordinating the sprint planning and reporting on activities related to the use case.

Standardize operational practices and boundaries

The next common pattern to help CCoEs is to standardize operational practices and boundaries. There are two areas typically prioritized by customers. First is security and second is the developer experience. Today, engineering teams are often faced with a variety of standards, policies, and tooling that can create friction when they’re trying to build their applications. We’ve seen many of these customers resolve that friction by working hard to define a paved path for their development teams, and reduce the amount of time to get code from idea to production.

Successful examples of this include aligning existing processes, skill sets, and culture with patterns of automation, integration, continuous delivery, and monitoring.

With serverless architectures, our customers are able to innovate quickly by focusing on the business logic, trusting AWS to own more of the operational shared responsibility. With serverless, that shared responsibility shifts. So you can focus higher in the stack on security in the cloud.

You are no longer responsible for security of the cloud

That means you’re no longer responsible for security of the cloud, such as OS or code runtime patches, network traffic, or firewalls. Customers who have an API strategy see it going hand in hand with their serverless strategy. With serverless, you’re primarily responsible for security of your code, the storage and accessibility of sensitive data, and identity and access management. These permissions and restrictions are handled through the API. So each unit of code is exposed only to the vulnerabilities in its specific logic and dependencies. It has access only to its own resources.

But there are many similarities between serverless and traditional architectures in the best practices and in taking responsibility at the application level. Managing authentication and authorization, data encryption and integrity, and monitoring inside and outside of your app still hold true. It’s critical for you to ensure that the systems in place are followed each and every time.
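To make "access only to its own resources" concrete, here is a minimal sketch (an illustration, not something shown in the session) of a least-privilege IAM policy document for a single function. The helper name `least_privilege_policy`, the table name, and the account ID in the ARN are all hypothetical; the policy structure and the two DynamoDB actions are standard IAM.

```python
import json


def least_privilege_policy(table_arn: str) -> dict:
    """Build an IAM policy document that lets one function read and write one table only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only the two actions this hypothetical function needs, no wildcards.
                "Action": ["dynamodb:GetItem", "dynamodb:PutItem"],
                # Scoped to a single table rather than "*".
                "Resource": table_arn,
            }
        ],
    }


# Hypothetical ARN, for illustration only.
policy = least_privilege_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/orders"
)
print(json.dumps(policy, indent=2))
```

If each function gets its own narrowly scoped policy like this, a flaw in one function's logic exposes only the resources named in that function's policy.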

Developers start to focus on business logic

It takes a little bit of a different perspective with serverless to think through whether a particular security risk is being handled by AWS, or by a system within your stack. Building more and more with serverless means that developers begin an application by focusing on the business logic of the code. Not only does this improve time to market and agility, but it allows developers to innovate more as a team, meet these agility needs, and look at the standardization and integration of CI/CD pipelines.

Now, agile development isn’t new for serverless. But using code to model your application and infrastructure is particularly important as customers begin to build and release more regularly. Deployment frequency changes from weekly or monthly to hourly or daily, and the lead time for development changes from months to days.

This means that the processes to checkpoint quality and to integrate work need to be automated as much as possible, using tools like Jenkins or Terraform, or AWS services like AWS CodeBuild and AWS CodePipeline.

Tools for the full development lifecycle

Customers need tools that provide the full development lifecycle. So beyond CI/CD pipelines, customers need to consider provisioning, testing, and debugging, using services and tools like Spinnaker or AWS SAM. In fact, many of our customers have standardized on frameworks like AWS SAM or the Serverless Framework. And given the increased rate at which developers are writing applications with serverless, it’s critical that these decisions are put into place sooner rather than later.

Finally, the value of integration testing increases. With serverless apps, your app is composed of many more components, and all those components really need to play nicely together. So similar to security, having the right systems, tooling and testing in place will help you align better with your development process.

Monitoring and managing code in production

The third common pattern in addressing operational standards is around monitoring and managing your code in production. It’s important to note the context in terms of how applications are evolving. Namely, the customer experience is changing and more important than ever, and the resources of your applications are increasingly short-lived and ephemeral. There are many more connected parts of that application and the whole application is evolving at a faster rate with more deployments, often to more locations.

As applications evolve, monitoring needs to do a lot more than just watch the layers of the stack. With serverless, the burden of monitoring partly shifts to AWS. So you can focus on the business logic of the application code and data while also having better business visibility. Knowing that your application is performing correctly means monitoring beyond just failures. Is your application actually performing as expected? Are your customers getting the user experience that they expect and that you want to give them? What’s the usage? Are you hitting limits or experiencing latency? And finally, what’s the business and revenue impact? What trends can you visualize and potentially plan for?
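As a rough illustration of "monitoring beyond just failures" (a sketch, not something presented in the session), the snippet below summarizes a batch of invocation records into usage, error rate, and 95th-percentile latency. The record fields `latency_ms` and `error`, the `summarize` helper, and the sample data are all hypothetical.

```python
import math


def summarize(invocations):
    """Summarize invocation records into usage, error rate, and p95 latency."""
    if not invocations:
        raise ValueError("need at least one invocation record")
    latencies = sorted(record["latency_ms"] for record in invocations)
    errors = sum(1 for record in invocations if record["error"])
    # Nearest-rank percentile: the smallest value with at least 95% of samples at or below it.
    rank = math.ceil(0.95 * len(latencies))
    return {
        "count": len(invocations),
        "error_rate": errors / len(invocations),
        "p95_latency_ms": latencies[rank - 1],
    }


# Made-up sample: 20 invocations with latencies 100..290 ms, one of them failing.
sample = [{"latency_ms": 100 + 10 * n, "error": n == 7} for n in range(20)]
summary = summarize(sample)
print(summary)
```

A failure count alone would report one error here; the latency percentile and the usage count are what start to answer whether customers are getting the experience you intended.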

Our customers are evolving their thinking around monitoring. Having visibility into the system as a whole, not just knowing that there’s an issue, but why it’s happening and creating a path for resolution is increasingly important.

Fender’s tool chain and frameworks

Now, this is just the technical aspect of managing your code in production. Successful customers have seen changes more broadly in their organization in terms of how they’re thinking about performance, failures, data collection, and the practices behind those. So what does this all look like in real life? Here’s the view of Fender’s tool chain and frameworks. Fender realized early on that they needed to look across their operational processes in order to build and deploy their serverless microservice architecture.

For their new product Fender Play, they had a chance to set up an entirely new architecture, and to support the APIs required. So there was a luxury of starting from scratch with them. They realized that while many aspects of infrastructure management and maintenance are handled and owned by AWS, they still needed to automate the connection between different components and to keep track of changes. They experimented with a variety of tools, and this is what worked best for their use case.

AWS Well Architected Framework

The last common pattern of success is around getting hands-on experience with launching pilots into production. We’ve talked a lot about the value of serverless in enabling innovation. Innovation requires change and experimentation. And we think it’s easiest to do that when you can do lots of experiments and discover what works, and then iterate on areas that need to be addressed.

We’ve seen customers leverage the AWS Well-Architected Framework every step of the way, from ideation to alpha versions to right before launch. The Well-Architected Framework, and specifically the Serverless Lens, presents a set of foundational questions to see how your architecture aligns with best practices, and provides guidance for making improvements.

Take the opportunity to set yourself up in the right direction. Each pilot should be measured with KPIs and metrics as an objective way to evaluate how things performed before and after. Share those with your team. Whether it’s cost savings or time savings, it draws interest in how you are building differently. And document the lessons learned and how you got to those success metrics.
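One simple way to make that before-and-after comparison objective is to report the percentage change per KPI. The sketch below is illustrative only: the KPI names and every number in it are invented for the example, not results from any customer.

```python
def percent_change(before: float, after: float) -> float:
    """Return the relative change from before to after, as a percentage."""
    return (after - before) / before * 100


# Made-up KPI values for a hypothetical pilot.
baseline = {"lead_time_days": 30.0, "monthly_cost_usd": 1000.0}
pilot = {"lead_time_days": 3.0, "monthly_cost_usd": 400.0}

report = {
    kpi: round(percent_change(baseline[kpi], pilot[kpi]), 1) for kpi in baseline
}
for kpi, change in report.items():
    print(f"{kpi}: {change:+.1f}%")
```

Negative values mean the pilot reduced that KPI, which is the desired direction for both lead time and cost in this example.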

Set pilots to help accelerate adoption of serverless

Now, this seems really obvious, but it’s important to notice and remember that the pilot has to make sense for the entire organization. These pilots often set the foundations for scaling and standardizing on reusable patterns. And these patterns can help accelerate adoption of serverless outside of the initial pilot team, and provide a better experience for those across the entire organization.

Finally, we see our most successful customers build a community of internal evangelists. Find those who are passionate about the technology; they can help engage and work with the broader team. Enabling a serverless first strategy is a journey. Customers like Comcast and Capital One and startups like Amenity have really seen the benefits of their serverless first approach, while Coke and McDonald’s continue to move forward and build with serverless.

You may be thinking, I’m not quite sure how to get there from where I am today. Whether you’re starting out in the cloud, already doing DevOps, or an enterprise with on-prem data centers, it’s about where you want your business to be in the next three to five years, and how to get started on that journey. And how can we reduce the amount of tech debt that you’re creating within your organization so that you can maximize the time and value for innovation?

Cloud Centre of Excellence on The Serverless Edge — Photo by Redd on Unsplash.com

Liberty Mutual and Serverless First Mindset

To that end, I’d like to introduce Dave Anderson, Director of Technology at Liberty Mutual. Dave will share the journey of Liberty Mutual to an organization-wide serverless first mindset. He’ll provide real life experiences with serverless and give practical guidance on the benefits of adopting it within your organization. Please welcome Dave Anderson.

Dave

I often wonder why a large old insurance company is interested in tech like serverless.

Liberty Mutual protects millions of people across the globe and when you call upon us, we need to be ready. We can’t afford to waste time with tech that doesn’t work. I like to think we have a different North Star. We think of insurance differently, we are future facing. We think about the future of shelter, which protects how you live. The future of mobility, which protects how you move around, and the future of commerce, which protects how you work. We don’t just sell policies, we think differently, and at a global level.

Wardley Mapping

Today, I’ll explain why we got to a serverless first Cloud Center of Excellence. My team and I are huge fans of Wardley mapping, and we have been mapping for many years. But around five years ago, we started thinking far ahead. How could we maximize the four values you see listed here: innovation, speed, value, and learning? How do we ingrain these for ourselves and our business partners?

We mapped the landscape and decided not to deep dive into DevOps and containers, because we needed a serverless first strategy. We didn’t know what it was called yet, but we knew what it felt like.

Liberty Mutual’s six step journey

I’d like to deep dive into six steps along our journey and explain our serverless foundations. Our original goal wasn’t serverless, it was fast feedback, and that means people. Before the book Team Topologies even came out, we created an enablement team. I believe a collaborative network is the best way to get deep into the organization. Good engineers are influenced by other good engineers; by their peers. Our team of expert enablers, expert builders, if you will, was able to connect with the best teams and help them solve problems and explore the best way to build cloud native systems.

That’s co-creation.

At Liberty Mutual, I believe we have a culture of experimentation and innovation. We started this serverless journey in 2015. We started to see results in early 2017 and right through to today. Our pioneering teams created these significant solutions using a serverless first approach.

Liberty Mutual’s Serverless Solutions

Our virtual assistant is a serverless natural language processing system for resolving many queries. It can now handle around 200,000 calls a month for four cents per call, and provide positive customer satisfaction. That’s with no wait time.

Our financial central services solution is helping move towards a single global general ledger that can process a million transactions for $60 and usually handles around 100,000,000 transactions a month. It uses AWS Step Functions to ensure that not a single transaction is dropped. With a global digital ecosystem, we’re driving a global book of business. Serverless allows us to reduce operational burden and have a global presence. This means we can focus on growing our business.
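As a back-of-envelope check on those figures (an illustrative calculation, not one shown in the session), 200,000 calls at four cents each comes to about $8,000 a month. The sketch below works in whole cents to avoid floating-point rounding; the variable names are made up for the example.

```python
# Figures quoted in the talk: roughly 200,000 calls a month at four cents per call.
calls_per_month = 200_000
cost_per_call_cents = 4

# Work in whole cents so the arithmetic stays exact, then convert to dollars.
monthly_cost_cents = calls_per_month * cost_per_call_cents
monthly_cost_dollars = monthly_cost_cents // 100

print(f"Approximate monthly cost: ${monthly_cost_dollars:,}")
```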

The worker digital assistant was originally an internal project that we moved into a startup. This is an entire serverless first company. It was built by eight engineers with robust enterprise compliance. There’s no way we could have achieved that with any other approach than serverless.

Finally, ID3E Intelligent Documents. This is something quite new and innovative, which helps with document understanding and reduces repetitive work. We built this in 12 weeks with a cross functional team, with only a few engineers. Serverless allowed us to experiment at low cost.

Executive North Star

It was critical to our serverless strategy that we had an executive north star. In 2016, our CIO James McGlennon created our Technology Manifesto. We had already started our cloud journey around 10 years ago, but we started to scale in 2016, driven by this Manifesto. When we started writing this Manifesto, it very clearly signposted the modern way of building applications. And in the second strategy document in 2018, we stated that the future of the cloud was serverless. We put this out in 2019 and it describes how a serverless strategy can reduce operational burden. Just last year, we started to call it a serverless first approach.

Part of my role at Liberty Mutual is to ensure that the leadership narrative is tied to the engineering teams, to define a strong and compelling vision of the future. At this stage, many of our governance functions are fully on board. Our security team has been awesome. They have really embraced this thinking.

The ambassador role is crucial to ensure that the technical strategy is baked into the leadership narrative. It’s also great to hear our executives talk and get excited about serverless. Then we needed to accelerate the engineers. I thought reference architectures were quite good, but quickly realized that the serverless landscape changes so fast and allows so many different ways to build. There’s no one reference architecture that works. We love the Lego analogy for the cloud and decided that reusable patterns were the way forward.

CDK Patterns

Last year, we jumped on CDK, the AWS Cloud Development Kit. When it was launched, we got really excited. One of my team members, Matt Coulter, decided to create an open source project called CDK Patterns (cdkpatterns.com). This was a great way for us to explore our ideas, and also give something back to the community. We were active in the serverless community anyway. So we started a global CDK community, and celebrated CDK Day last September. It was a really awesome, fantastic day.

What we do is we take the CDK Patterns from the external site and bring them back in-house to our internal platform. We call it the software accelerator. By adding compliance and security rules, it is easy for any engineer from Liberty Mutual to spin up a collection of serverless components in minutes.

Time for the settlers and town planners

We know of about 3,500 applications deployed within Liberty Mutual using these patterns. At this stage, we have a great foundation. The next thing is to engage all of the engineers. The pioneers have gone already, it’s time to bring in the settlers and town planners. We need to give them confidence, we need to lower the barrier to entry and remove the fear factor.

The first thing I did in my area was encourage AWS Certification, either architect or developer certifications. I set a goal that 10% of my organization would be certified. We didn’t mandate it, we just encouraged learning. This was a great way to incentivize people to get certification as they formed study groups.

We also had some engineering events. I was lucky enough to organize ServerlessDays Belfast in January. And then we had a couple of AWS Summits, which was super. This really helped build a sense of excitement for AWS for our engineers. We also run hands-on sessions and CDK Workshops. We’ve co-created more CDK patterns. As we learned, it’s early days for CDK. So our people love the co-creation nature of that.

Engineering standards

And finally, we have clear engineering standards. We talk about expectations or goals. I personally review with many of our teams to make sure we’re on track. This is a great way to build in a feedback cycle, and to ensure that we can adapt quickly.

Over the past 12 months, we’ve really started to scale. We find that Well-Architected has been a game changer. I believe this is the secret sauce of cloud adoption. The five pillars have been very effective in driving the right conversations in our teams: operational excellence, security, reliability, performance efficiency, and cost optimization. Specifically, the Serverless Lens has been an extra layer on those five pillars. The differentiator here is how the process is run. We have our engineers perform Well-Architected reviews in their own teams as a collaborative exercise.

We’ll also use AWS if there’s something really big or challenging, but we try to run this process ourselves. This also gives my team a really tight feedback loop, in which we can spot signs early and work on the improvements for the engineers.

What’s the business value of Serverless and Well Architected?

With all of that said, what’s the business value of Serverless and Well-Architected?

What happens in systems when we have unexpected demands? Since lockdown started, we have observed some different traffic patterns. Insurance claims are constantly our highest priority. We want to help customers quickly. Traffic to our claims system, which was built for high traffic, suddenly dropped. (That’s the blue line here). The queries for claim status drastically reduced. This has rarely happened. But the system shrank in response. The second system, the orange line here, deals with outgoing payments. Traditionally, it’s a low volume system that is slow and steady. But press came out that insurance companies would refund some policy payments, so the team had to push an unprecedented amount of traffic through this system on short notice. The system grew rapidly in response. Both are serverless and Well-Architected. All the teams needed to do was triple check a few limits. There was no big rush, the systems responded appropriately.

The serverless edge is where we want to be. It’s the cutting edge of technology. We will stay ahead of the game at the edge of technology.

The first step is mapping out your journey

If I was starting today, the first step would be to map out your journey. The ‘code is liability’ mindset is crucial for engineers. You don’t need to build everything.

The ambassador role is important to ensure the engineering teams and the executive teams are connected for where they need to get to.

Most importantly, you need to speak the language of the business. My business partners don’t want to hear about Lambda, Kinesis, Step Functions. They want something fast, valuable, and low cost.

Finally, something to take away. Even our marketing department are serverless. They told me the slogan is about customization of insurance policies. But I think it might be about ‘compute’. You only pay for what you need. It’s a serverless message.

So that’s the theory and practice of enabling a Serverless First Cloud Center of Excellence.

A big thank you from Jessica and me for tuning in to our session today. We selected a few other sessions that we hope will help you on your journey. The first two offer a fantastic narrative on how this change works and the mindset you need to develop. The four sessions at the bottom speak to good practice and some key serverless and generic practices. I thoroughly recommend these.

So thanks again for tuning in. And please remember to complete the session survey.

We value your feedback. So again, thank you from Jessica and me.
