
IBM’s Quiet Approach to AI, Wasm and Serverless

The New Stack sat down with IBM Cloud CTO Jason McGee to discuss IBM’s perspective on serverless, Wasm, and, of course, what’s next for AI.
May 4th, 2023 6:00am

It’s been 12 years since IBM’s Watson took on Jeopardy champions and handily won. Since then, the celebrity of Watson has been usurped by ChatGPT, but not because IBM has abandoned Watson or artificial intelligence. In fact, the company’s approach to artificial intelligence has evolved over the years and now reflects a different, more targeted path forward for AI — beyond pumping out generic large language models.

I sat down with IBM Fellow and CTO of IBM Cloud Jason McGee during KubeCon+CloudNativeCon EU, to discuss how Big Blue is approaching modern challenges such as serverless, WebAssembly in the enterprise, and of course artificial intelligence. The conversation has been edited for clarity and brevity.

Using AI for Code Automation

What is IBM doing with automation?

There [are] a lot of dimensions to automation. At the base technology level, we obviously do a lot of work with Ansible and the Red Hat side, and then we use Terraform pretty extensively as a kind of infrastructure-as-code language for provisioning cloud resources and managing a lot of those reference architectures, [which] under the covers are essentially collections of Terraform automation that configure the cloud. There is also higher-level work going on in automation, and that's more like business process automation and robotic process automation, and things like that. With products like Watson Automate, [we] are applying AI and automation to customers' business processes and automating manual things. So that's kind of higher up the stack.

We have tools [like robotic process automation and business process management] in our space, and we're applying AI to that, and then down the technology stack we have software automation tools like Terraform and Ansible that we're using. We're doing some interesting work on Ansible in the research team, applying foundation models to code assist on Ansible and helping people write automation using AI, to help fill in best-practice code based on natural language descriptions and stuff.

What does the AI do in that context?

Think about if you're writing an Ansible playbook, you might have a block that's, "I want to deploy a web application on Node.js" or something. You could just write a comment, "Create a Node.js server running on port 80" in natural language, and it would read that comment and automatically fill in all of the code and all the Ansible commands, to provision and configure that using best practices. It's been trained on all the Ansible Galaxy playbooks and GitHub Ansible code. So it's like helping them write all the Ansible and write good Ansible […] based on natural descriptions of what they're trying to achieve.

The AI is based on large language models. Do they hallucinate? I keep hearing they hallucinate, and I'm reminded of the story "Do Androids Dream of Electric Sheep?"

A great question, and it's part of the example I gave you: that model was trained for the more narrow purpose of doing Ansible code assist, versus something like GPT, which was trained on everything, and therefore it can be more accurate at the smaller scope, right? It understands natural language but also understands Ansible very precisely, and so it can have a higher accuracy than a general-purpose large language model, which also could spit out Ansible or Terraform, or Java or whatever the heck you wanted it to, but maybe has less awareness of how good or accurate that output is.

We're using it in AIOps as well, for incident management, availability management and problem determination. That's another kind of big space that IBM is investing a lot in, with Instana, which is one of our key observability tools.

How do we help customers adopt and leverage large-scale foundation models and large language models? In IBM Cloud we have this thing called the Vela cluster, which is a high-performance foundation model training cluster in our cloud in Washington, D.C. It was originally built for our research team so that the IBM Research Group could use it to do all their research and training on models and build things like Project Wisdom on it.

Now we’re starting to expose that for customers. We believe that enterprises will build some of their own large language models or take base models — because we’re also building a bunch of base models — and then customize them by training them on additional unique data. We’re doing work in OpenShift, to allow you to use OpenShift as the platform for that. We’re doing work in open source around that software stack for building models. And then we’re of course building a whole bunch of models.

Beyond Traditional Serverless

TNS: What else are you here promoting today at KubeCon?

McGee: There's a lot of activity in this space that we've been working on for a long time, so it's more progression. One is serverless: we have a capability called IBM Cloud Code Engine, and that's based on Knative, which is like a layer on top of Kubernetes designed to help developers consume cloud native. We've been doing a lot of work recently expanding that serverless notion to a more varied set of workloads.

Traditional serverless was like apps and functions running event-driven kinds of workloads, with a lot of limitations on what kinds of applications you could run there. What we've been doing is extending that and opening up the kinds of workloads you can run, so we're adding in things like batch processing, large-scale parallel computation and compute-intensive, simulation kinds of workloads. We're starting to do some work on HPC [high-performance computing] so people can do financial modeling or EDA [electronic design automation], industrial design and silicon design workloads, leveraging a serverless paradigm. We have a lot of activity going in that space.

We’re also working with a project called Ray, which is a distributed computing framework that’s being used for a lot of AI and data analytics workloads. We’ve enabled Ray to work with the Code Engine so that you can do large-scale bursts [of] compute on cloud and use it to do data analytics processing. We’ve also built a serverless Spark capability, which is another data analytics framework. All of those things are exposed in a single service in Code Engine. So instead of having seven or eight different cloud services that do all these different kinds of workloads, we have a model where we can do all that in one logical service.
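To make the Ray model concrete, here is a minimal, generic Ray sketch in Python: it fans a batch of tasks out across whatever workers the runtime provides and gathers the results. It runs locally as written; pointing it at a managed backend such as Code Engine would require connection and cluster configuration not shown here, and nothing below is IBM-specific.

```python
import random

import ray

# Start (or connect to) a Ray runtime. Locally this spins Ray up in-process;
# against a remote cluster, ray.init() would be given that cluster's address.
ray.init()


@ray.remote
def simulate(seed: int) -> float:
    """A stand-in for one unit of compute-intensive work."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(1_000_000))


# Fan the work out across available workers, then gather the results.
futures = [simulate.remote(i) for i in range(100)]
results = ray.get(futures)
print(f"processed {len(results)} tasks")
```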

What kinds of use cases are you seeing from your customers with serverless?

One of the challenges with serverless is [that] when it started a few years ago, with cloud functions and Lambda, it was positioned in a very narrow kind of way — like it was good for event-driven, it was good for kind of web frontends.

That’s interesting, but customers actually get a lot more value out of these more large-scale, compute-intensive workloads. Especially in cloud, you’d have this massive pool of resources. How do you quickly use that massive pool of resources to run a Monte Carlo simulation or to run a batch job or to run an iteration of design verification for a silicon device you’re building? When you have those large-scale workloads, the traditional way you would do that is you would build a big compute grid, and then you have a lot of costs sunk in all this infrastructure.

We're starting to see them use serverless as the paradigm for how they run these more compute-intensive, large-scale workloads, because that combines a nice set of attributes: the resource pool of cloud, [a] pay-as-you-go pricing model and no infrastructure management. You just simply spin up and spin back down as you run your work. So that's the angle on serverless where we're seeing a lot more adoption.
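As a rough illustration of the pattern McGee describes, the sketch below estimates pi with a Monte Carlo simulation using only Python's standard library, fanning batches out to a worker pool and gathering the results. Locally the workers are processes; on a serverless platform the same fan-out/fan-in shape would map to containers that exist only while the job runs. This is an illustrative sketch, not IBM's implementation.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def estimate_pi(samples: int) -> float:
    """One Monte Carlo batch: fraction of random points inside the unit circle."""
    inside = sum(
        1
        for _ in range(samples)
        if random.random() ** 2 + random.random() ** 2 <= 1.0
    )
    return 4.0 * inside / samples


if __name__ == "__main__":
    batches, samples = 32, 500_000
    # Locally this fans out across processes; in a serverless model the same
    # shape would spin workers up for the batch and release them afterward.
    with ProcessPoolExecutor() as pool:
        estimates = list(pool.map(estimate_pi, [samples] * batches))
    print(f"pi is approximately {sum(estimates) / len(estimates):.4f}")
```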

Wasm’s Potential

Are people using serverless on the edge?

They do. It’s more niche, of course. But you see, for example, in CDN (content delivery network), where people want to push small-scale computation out to the edge of the network, close to the end users — so I think there [are] use cases like that. At IBM Cloud, we use Cloudflare as kind of our core internet service, [with] global load balancer and edge CDN, and they support our cloud functions. You see technology like Wasm — just a lot of people here talking about Wasm. Wasm has a role to play in those scenarios.

Is IBM doing anything with Wasm? Is it useful in the enterprise?

We're enabling some of that; we're looking at it in the edge. We support Wasm in Code Engine, and it gives you a nice, super fast startup time, like workload instantiation in 10 milliseconds or something, because I can inject it straight in with Wasm, which is useful if you're doing large-scale bursty things but you don't want to pay the penalty of waiting for things to spin up.

But I still think that whole space is more exploratory. It's not like there [are] massive piles of enterprise workloads waiting to run on Wasm, right? So it's more next-gen edge device stuff. It's useful; there [are] some interesting use cases around that HPC [high-performance computing] space potentially … because I can inject small fragments of code into an existing grid. But I also think it's a little more niche, specialist workloads.
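Code Engine's Wasm support isn't shown here, but the speed McGee mentions comes from how cheaply a Wasm module can be compiled and instantiated. The sketch below uses the wasmtime Python bindings, chosen purely as an assumption for illustration, to load and call a tiny module in-process.

```python
from wasmtime import Engine, Instance, Module, Store

# A tiny WebAssembly module (text format) exporting an `add` function.
WAT = """
(module
  (func (export "add") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add))
"""

engine = Engine()
store = Store(engine)

# Compiling and instantiating a small module takes on the order of
# milliseconds, which is what makes Wasm attractive for bursty,
# short-lived serverless work.
module = Module(engine, WAT)
instance = Instance(store, module, [])
add = instance.exports(store)["add"]
print(add(store, 2, 3))  # -> 5
```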

CNCF paid for travel and accommodations for The New Stack to attend the KubeCon+CloudNativeCon Europe 2023 conference.
