DevOps Overview, News, Trends and Analysis | The New Stack

Kubernetes Operators: The Real Reason Your Boss Is Smiling

It’s no industry secret that the cloud native segment around Kubernetes has shifted toward hosted Kubernetes providers who build, run and partially manage the Kubernetes infrastructure for organizations. Compared to organizations building and maintaining their own Kubernetes infrastructure, hosted Kubernetes providers allow you to offload a measurable amount of technical complexity so staff can focus on operations and innovation.

Along with the rise of hosted Kubernetes providers, more enterprises are favoring larger Kubernetes distributions from the likes of OpenShift, Rancher, Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (EKS) and others rather than building their own homegrown distribution from the upstream codebase.

These trends are not limited to the Kubernetes platform itself but showcase a general movement toward letting the providers of strong core platform layers do what they do best so companies can focus on the business value that comes from building on top of Kubernetes. This echoes the chant heard in 2017 to “make Kubernetes boring,” and I think we are getting there as an ecosystem.

But that was six years ago. What does “boring” look like in 2023 and how do new trends like the rise of Kubernetes operators fit into this picture? There are three ways I think of this when evaluating modern Kubernetes deployments:

I want my organization to build value on top of Kubernetes.

Similar to the mantra of 2017, the “value” we mean here is everything that is built on top of Kubernetes and the infrastructure layers, which has seen substantial progress and evolution from the community over the past six years.

I want Kubernetes to be simple.

Every organization is unique, and roles within your organization may differ depending on not only size, but also Kubernetes maturity. Because of this, skill sets vary, and not everyone has the time or ambition to become an expert. Those who aren’t experts want Kubernetes to be easy so daily tasks aren’t intrusive.

I want Kubernetes to be scalable.

Deployment models for Kubernetes are expanding, and enterprises are taking advantage of using Kubernetes across on-premises, multicloud and hybrid cloud environments. Kubernetes needs to be flexible across these environments while also enabling cluster growth with streamlined scalability as the practice matures.

Building Value on Top of Kubernetes

Once the Kubernetes infrastructure layers are solid for your organization, it’s time to build the “value” on top, whether that is an application end users interact with or a platform layer that adds advanced data services such as observability. Developers need to start somewhere, and this usually consists of finding the right Kubernetes resources for the workload, such as creating deployments, services, jobs, statefulsets, daemonsets, persistent volumes, pod security policies, role-based access control (RBAC) rules, secrets, service accounts and much more.
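
As a concrete illustration, here is a minimal sketch of creating just one of those resources, a Deployment, with the official Kubernetes Python client; the image name, labels and namespace are placeholder assumptions, not anything prescribed here:

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (use load_incluster_config()
# when running inside a cluster).
config.load_kube_config()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="hello-api"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "hello-api"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "hello-api"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="hello-api",
                        image="registry.example.com/hello-api:1.0",  # placeholder image
                    )
                ]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```

Multiply this by services, RBAC rules, secrets and the rest of the list above, and the bookkeeping burden becomes clear.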

Managing and tracking all these resources can get quite complicated, and it’s likely that your team doesn’t need to control all these objects, only the resources that affect how applications run. There are cases where this development practice is necessary: For instance, if the application you are building is unique to your organization, then working directly with the Kubernetes API resources saves you from having to start from scratch.

However, on the flip side, we see DevOps teams, developers and application owners turning to trusted, prebuilt Kubernetes operators to run, configure and manage common applications so they can focus on the value above these layers.

Operators: Bringing Together Value, Simplicity and Scalability

If you’re not familiar with what a Kubernetes operator is, then I suggest reading the documentation.

Switchboard operator

However, whenever I hear the term “operator,” my mind immediately jumps to a switchboard operator with a massive telephone network in front of them moving wires in and out at a rapid pace while transferring calls.

You may remember them from the pilot of the hit show “Mad Men” or recall the popular saying, “Operator, please hold.”

Much like the way a switchboard operator in the 20th century assisted in the routing and transfer of phone calls, a Kubernetes operator facilitates the deployment, management and ongoing operations of a Kubernetes application. Except instead of having a person move wires behind a telephone switchboard, think of it as a robot who is listening to the inputs and commands and outputting the Kubernetes resources in the appropriate namespaces.
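
To make the robot analogy concrete, here is a minimal sketch of the operator pattern using Kopf, an open source Python operator framework. The framework choice and the custom resource (group, version and plural) are illustrative assumptions, not something this article prescribes:

```python
import kopf

# React when a custom resource of the hypothetical kind ManagedCache is created.
@kopf.on.create("example.com", "v1", "managedcaches")
def create_fn(spec, name, namespace, logger, **kwargs):
    replicas = spec.get("replicas", 1)
    logger.info("Provisioning cache %s in %s with %d replicas", name, namespace, replicas)
    # A real operator would now create Deployments, Services, ConfigMaps, etc.,
    # and keep reconciling them against the desired state declared in `spec`.
    return {"phase": "Provisioning"}

# Run locally with: kopf run operator.py
```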

It’s Like a Robot, but without the Attitude

Unlike the switchboard operator, a Kubernetes operator has automation as its core tenet. Automation is a necessity as the community forges ahead with Kubernetes, allowing end users to focus on what matters to them while relying on operators to automate deployments, operations and management of common components in their stack.

There is a community propensity to use trusted operators for applications and not reinvent the wheel when running a particular service on Kubernetes. Take the database landscape’s use of operators as an example.

As seen at KubeCon EU in Amsterdam, the operator pattern has a strong use case for databases because, in general, they are a common denominator in many application stacks. Applications may use Postgres or Redis in slightly different ways, but they are common services that need to be installed, configured and managed. Deploying databases on Kubernetes via trusted, production-ready operators is a major win for time-to-value in DevOps development cycles.

It doesn’t stop at databases, though. Operators can be used for almost anything, from monitoring and alerting software, to storage integrations, to fully customized applications delivered to internal customers.

It’s great to see the focus move northbound as the Kubernetes ecosystem matures. As end users and organizations are gravitating to hosted Kubernetes and application automation through operators, I’m excited to see the innovations that come next focus on what can be built on top of Kubernetes.

How Do We Use Operators?

Operator frameworks are extremely popular among Dell’s customers, and we are actively working to introduce deeper operator capabilities for our Kubernetes storage offerings, such as our container storage modules and container storage interface drivers, which are available on OperatorHub.io. Operators are also a key part of our future portfolio offerings and will be integrated into our upcoming user interface for Kubernetes data storage.

The benefits of using operators are straightforward: less time spent on manual processes, more time spent on coding and innovation. If your business hasn’t started with operators yet, I highly suggest exploring the world of Kubernetes operators and seeing how to take advantage of automation to make your life a little easier.

Simple, scalable and adding value on top of Kubernetes.

At PlatformCon: For Realtor.com, Success Is Driven by Stories

You’re only as good as the stories you tell. Storytelling, after all, is a tenet of humanity, and the best way to pass information, at least when it’s anchored in context. It’s also a pillar of successful sales, no matter what you’re selling or who you’re selling it to.

For platform engineering, your eager or not-so-eager audience is made up of your colleagues, the internal developers as well as other company-wide stakeholders and influencers. You have to understand the context and needs of your different target personas, and how they could respond to the changes you’re making. Much of intentional developer experience and platform adoption hinges on your ability to convey what works and what hasn’t, often socratically repeating back to be sure you comprehend your stakeholders’ stakes — and making sure they feel heard.

For Realtor.com, a platform engineering mindset is anchored in the power of success stories. Suzy Julius, SVP of product and engineering, joined the virtual PlatformCon stage to share how the top U.S. real estate site, with 86 million visits per month, went from a culture where you couldn’t say platform to a culture that embraces it.

The First Step Is Always Recognition

Realtor.com is a company that has scaled over the last couple of years mainly via acquisition, which often results in either spaghetti architecture or a complete lack of visibility into other business units. It pretty much always signals an increase in complexity.

“Our tech stack became extremely complex, slowing down our ability to build features in a fast and reliable way,” Julius said. “The existing tech stack made it difficult to ensure a quality product or ensure reliable feature releases.”

Facing its divergent and often duplicated tech ecosystem, in 2020, the company embarked on a transformation, with the aim to “simplify to scale” in order to accelerate innovation.

A platform emerged as the solution.

When Julius joined the company at the start of 2021, her team recognized the common barriers to platform adoption, namely “knowing that there was a reluctance to building a platform, with fear that one would slow down the engineering team by creating more complexity.” That’s not an uncommon hurdle for platform engineers to face.

So the platform team kicked off this journey by gathering feedback from a diverse set of stakeholders, not just from engineering but also from business and security, and offered a compelling success story, she explained. Now, 150 people are considered part of the platform organization — a mix of product leaders and engineers, who she said are all “focused on developer experience, data, content and personalization.”

Next, It’s Time to Adopt a Product Mindset

Come 2022, the platform team was embracing a platform mindset, concentrating on developer enablement and providing a service to their colleagues. Specifically, Julius outlined the aims as:

  • To provide service to others to help everyone go faster and more reliably.
  • To understand as a platform team the vision and principles, and then to get corporate buy-in.
  • To be able to show short-term and long-term wins.
  • To measure, iterate and evangelize the vision to be a platform empowering all products and unlocking business opportunities.

These goals, she said, mostly focused on developer experience, but they also created a data platform for a “clear line of sight to understand business metrics or give analytics the ability to create a canonical source of truth dataset for our consumer and customers.”

The tech stack that drove this sociotechnical change included:

  • For developer experience — CircleCI, Apollo supergraph, GraphQL, Amazon EKS, ArgoCD, Tyk API gateway, Vault developer portal
  • For data, content and personalization — Fivetran automated data movement platform, Snowflake for data integration, Apache Kafka, DBT for data warehousing, Apache Airflow, NodeJS, Amazon SageMaker for machine learning, Optimizely, Metaflow data science framework, ElasticSearch

All the platform tech, people and processes are aligned around the vision to become the preferred platform on which their internal customers choose to build. That is grounded, Julius explained, in connecting wins with features that drive business metrics, namely, revenue and/or user engagement.

She highlighted sociotechnical lessons they learned over the past year:

  • A platform mindset is not just a technical but a cultural shift.
  • Adoption hinges on training, documentation and awareness.
  • You need a tighter feedback loop to establish stakeholder sentiment.
  • Be aware not to over-index on microservices. For example, they had rate-limiting in different locations, which Julius said made it hard to build client features.
  • Align around a few programming languages, as too many make it much harder to build cross-company platform features like logging and monitoring.
  • And, in a time of tighter budgets, make sure you commit to continuously invest in your platform service, no matter what.

Keep up the Momentum

Now, this year at Realtor.com is all about embracing the Platform as a Product mindset and building a differentiated, self-service product suite. Treating your platform as a product is about treating your developers like your customers, always focusing on improving developer experience or DevEx. For Realtor.com, this includes continuous feedback and stakeholder scorecards.

This year is about “understanding that we need to continue to solve problems and to make it easy and intuitive to use our platform,” Julius said. “And we need to realize gains beyond tech, [like] more involvement and input into what the platforms do and how they can help the entire company.”

Many of the platform engineering thought leaders The New Stack has interviewed have talked about the importance of using the platform as a single pane of glass to create a common language between business and engineering. This helps business understand the value of the big cost center that is engineering, while engineering can better connect their work to driving real business value to end customers. Julius’s team stands out in looking to leverage the platform to measure that effect. She said they are currently working “to incorporate how platforms impact our end-user strategy and experience,” connecting the DevEx to the DevOps.

They are also working out how to evangelize the platform internally. Like with all things, communication is key, including around onboarding and design-first thinking. They are customizing their messaging for different stakeholders. Julius noted they all have to get comfortable repeating themselves to not get lost in the email and Slack cacophony. The platform team is also considering adopting a tool like Backstage to help facilitate that internal product marketing and to, as she said, “bring it all together.”

All this feeds into a continued highlighting of performance, security and reliability gains.

Julius then turned to their playbook: identity (start with the end state and vision), principles and self-awareness, a first-team mindset, reputation and brand, execution and barriers, and the importance of failure.

How Mature Is Your Platform?

Platform teams are cost centers, but, until recently, developer productivity wasn’t something that could be easily measured. This means that platform teams have had difficulty assessing their performance and impact. Last month, a new DevEx framework came out that examines developers’ flow state, feedback loops, and cognitive load.

The month before, the Syntasso team open-sourced their Platform Maturity Model, which guides teams in answering the following questions:

  • How does the company value (and therefore fund) platform efforts?
  • What compels users to start, and be successful, using your platform?
  • How do users interact with and consume offerings on your platform?
  • How are requests and requirements identified and prioritized on your platform?
  • How does product engineering manage non-differentiating (and often internally common) tooling and infrastructure?
  • How does each business requirement (e.g. compliance or performance) get enabled by platform offerings?

Each of these questions has answers from Levels 1 through 4 to mark maturity of a platform team.

The Realtor.com platform team has created what it refers to as a playbook — an artifact that helps continuously build onto the organization’s Platform-as-a-Product culture. This includes their own maturity model. “It’s recognizing and reminding us that we don’t want to stop at a platform that just works, but we want to be seen for the good and invested in,” Julius said.

Pulling a metaphor from the company’s core industry, she compared a platform to a house. There are parts that you don’t really notice until something goes wrong, like a window that won’t open or a cracked foundation. She explained, “Where we strive to mature as a platform is when you notice the doors, you notice the windows, and they’re seen for the good.”

Next, the playbook features two decision-making frameworks to decide when to slow down or speed up. She called them a flywheel to show how they make decisions collaboratively and cross-functionally, “in a way that we can keep coming back and pointing at that decision as we progress.” They are:

  • Strategic technical initiative group (STIG) — to ensure technical decisions are made collaboratively and consider the future tech stack and feature development.
  • Cross-functional workshops — to collaborate and focus on both the Platform-as-a-Product and tech strategy.

Finally, the playbook centers on identity — which Julius said she could’ve given a whole talk around, it’s that essential to the Realtor.com product team. Identity leans into the importance of vision and purpose. A platform team always needs empathy, she argues, putting itself in its stakeholders’ shoes to better understand the technology and onboarding. It’s treating internal customers with the same level of care as external users.

Identity is all about understanding what a success story looks like and working backward to identify key aspects of that story, Julius explained, aligning that story with key decisions and remaining focused on the vision. It’s always about maintaining the organization’s reputation and grounding every decision in context.

“This is all about having the end state in mind, combining the fundamentals with your vision. It’s that compelling story of success.”

How DevSecOps Teams Should Approach API Security

Software organizations need to store data and expose it over the internet to user-based applications. The standard way to manage this is to host APIs. In a typical architecture, API endpoints are called by both web and mobile clients. APIs are usually also broken down into manageable codebases, sometimes called microservices, which then call each other.

These components are likely familiar to anyone working in software, from business owners to developers, DevOps and compliance staff. Yet in my experience, it is common for these roles to lack a unified vision on how they approach API security. Therefore, in this post, I will provide a recommended API security setup that benefits all parties involved.

Token-Based Architectures

To secure APIs, you must send a message credential with every API request. The most secure way to protect your data is to design this credential with minimal privileges, based on end-to-end client and API flows. The credential must be unforgeable and sendable from any type of client. The JSON Web Token (JWT) format meets these requirements.

The following example shows one possible financial use case: An app sends a token to an API, which forwards it to other APIs. The token restricts access to a particular user and payment transaction. The token is locked down in business terms and is, therefore, more secure than an infrastructure-based credential, such as a client certificate.
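
For illustration, the claims in such a narrowly scoped token might look like this; the claim names and values are hypothetical for this payment scenario:

```python
# Hypothetical claims for a scoped payment-transaction access token.
access_token_claims = {
    "iss": "https://idp.example.com",  # the trusted token issuer
    "sub": "user-2481",                # the authenticated end user
    "aud": "payments-api",             # the intended API audience
    "scope": "payments:read",
    "transaction_id": "txn-903842",    # limits access to one transaction
    "exp": 1718400000,                 # expiry as a Unix timestamp
}
```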

The ability to lock down access according to business rules is the primary security behavior of the OAuth 2.0 authorization framework. On every request, the API must cryptographically verify a JWT access token, after which the API can trust the values contained within the token, which are called claims. In this example, APIs could use the received transaction_id claim to restrict access to the single payment transaction. More commonly, APIs filter resources according to business rules based on the user’s identity.

Access tokens issued to each client can be designed differently based on that client’s end-to-end API flows. APIs use token sharing to forward access tokens to each other so that the user identity and claims flow securely. Each API then implements authorization using the received claims. This is more secure than solutions that receive a user ID in an encrypted cookie, where the API always allows the user’s full privileges.
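
For instance, an API’s claims-based authorization check can be as small as this hedged sketch, where the claim name mirrors the hypothetical payment example above:

```python
def can_view_transaction(claims: dict, transaction_id: str) -> bool:
    # Enforce the least-privilege claim rather than granting the
    # user's full rights, as a cookie-based user ID would.
    return claims.get("transaction_id") == transaction_id
```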

Using tokens in this way provides a zero trust API architecture, where APIs do not trust each other. Instead, they only trust the token issuer, which in OAuth 2.0 is the authorization server. A third party provides this specialist component, and using one should give the richest security capabilities for applications and APIs.

This article’s main focus is DevSecOps, so next I will discuss how this API architecture affects security-related roles within an organization. The main behaviors, and the benefits of a token-based architecture, are most apparent once the essential requirements from all DevSecOps stakeholders are understood.

Development Teams

When using OAuth 2.0, frontend developers don’t have to deal with the complexity of user authentication. Instead, logins are implemented using a code flow. This involves redirecting the user to authenticate via the authorization server, which provides many ways to authenticate users. The party providing the authorization server should continually add support for the most cutting-edge authentication options. However, frontend developers need to understand the moving parts, including OAuth 2.0 messages, expiry events and error conditions. Therefore they must learn some OAuth troubleshooting skills.

Meanwhile, both developers and testers need productive ways to get user-level access tokens for test users so that they can send sample API requests. There are various options here, such as using online tools to run a code flow or using mock OAuth infrastructure. The end result should be a productive setup where APIs can easily be supplied with an access token, which is then validated using a token-signing public key downloaded from the authorization server.
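
A minimal sketch of that validation step using the PyJWT library follows; the JWKS URL, audience and issuer values are assumptions that depend on your authorization server:

```python
import jwt  # PyJWT
from jwt import PyJWKClient

# Assumed endpoint where the authorization server publishes its signing keys.
JWKS_URL = "https://idp.example.com/.well-known/jwks.json"

def verify_access_token(token: str) -> dict:
    # Fetch the token-signing public key matching the token's key ID.
    signing_key = PyJWKClient(JWKS_URL).get_signing_key_from_jwt(token)
    # Verify signature, expiry, audience and issuer, then return the claims.
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience="payments-api",
        issuer="https://idp.example.com",
    )
```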

Security Teams

Security and compliance teams have their own requirements, which are typically captured by architects when designing API solutions. These span areas like API hosting, browser security, managing personal data, auditing and regulations. The authorization server provides ways to externalize some of these difficult requirements, such as privacy and consent, from applications and APIs. Security teams should also have an awareness of OAuth 2.0 secure development best practices for APIs and clients.

The security team should first ensure that the token-based architecture meets confidentiality requirements. Access tokens delivered to APIs should use the JSON Web Token (JWT) format, yet since these are easily readable, they should not be returned to internet clients. To ensure token confidentiality, the preferred option is to use the phantom token pattern. This involves clients receiving opaque access tokens, which reveal no sensitive data. When APIs are called, an API gateway can introspect the token and forward a JWT to APIs. The end-to-end flow does not add any complexity to API code or require APIs to manage their own crypto keys.
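
The gateway side of that exchange can be sketched with a standard RFC 7662 introspection call; the endpoint URL and client credentials below are placeholder assumptions:

```python
import requests

INTROSPECTION_URL = "https://idp.example.com/oauth/introspect"  # assumed endpoint

def introspect(opaque_token: str) -> dict:
    # RFC 7662 introspection: the gateway authenticates with its own
    # client credentials, registered at the authorization server.
    resp = requests.post(
        INTROSPECTION_URL,
        data={"token": opaque_token},
        auth=("api-gateway", "gateway-secret"),  # placeholder credentials
        timeout=5,
    )
    resp.raise_for_status()
    result = resp.json()
    if not result.get("active"):
        raise PermissionError("access token is not active")
    return result
```

In the phantom token pattern, the authorization server can return a signed JWT from this exchange, which the gateway then forwards upstream in place of the opaque token.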

Some organizations use an entitlement management system, such as Open Policy Agent, to centralize authorization. Doing so gives the security team the best visibility into which parties access important business resources. APIs using a token-based architecture integrate well with such systems, since the access token serves as a policy information point (PIP) that can be sent to a policy decision point (PDP), either from the API gateway or the API itself.
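
As a hedged sketch, asking OPA for a decision is a single call to its Data API; the policy path here is an assumption:

```python
import requests

OPA_URL = "http://localhost:8181/v1/data/apis/authz/allow"  # assumed policy path

def is_allowed(claims: dict, action: str, resource: str) -> bool:
    # Send the verified token claims as input to OPA and read the decision.
    resp = requests.post(
        OPA_URL,
        json={"input": {"claims": claims, "action": action, "resource": resource}},
        timeout=2,
    )
    resp.raise_for_status()
    return bool(resp.json().get("result", False))
```

Whether the gateway or the API makes this call, the decision input comes from the verified token claims.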

DevOps Teams

In an OAuth 2.0 architecture, APIs and user applications outsource all of the low-level security, including key management and user credentials, to the authorization server. Over time, this component therefore accumulates many intricate security settings. The DevOps team is most often responsible for maintaining its high availability and correct production configurations.

The authorization server should be considered a specialist API, hosted right next to the organization’s APIs. Doing so provides the best performance and allows control over which endpoints are exposed to the internet. DevOps teams should also understand how to de-risk authorization server deployment and upgrades. They should use an OAuth 2.0 parameterized configuration created only once, after which the same binaries and configuration are simply promoted down a pipeline.

Once the token-based architecture is live, DevOps teams need a productive way to manage Day 2 operations for both APIs and the authorization server. This should include dashboard integration, auto-healing, auto-scaling, alerts and useful technical support logs.

DevOps teams often implement security jobs related to the API gateway. An example might be implementing intelligent routing of API requests, such as to the user’s home region, to meet data sovereignty restrictions identified by the security team. Consider an American user being re-routed to the correct region, based on a claim in the access token, to ensure that the user’s transactions are stored in the United States.
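
A minimal sketch of such claim-based routing follows; the region claim name and upstream URLs are illustrative assumptions:

```python
# Assumed per-region upstream API deployments.
UPSTREAMS = {
    "us": "https://api.us.example.com",
    "eu": "https://api.eu.example.com",
}

def select_upstream(claims: dict) -> str:
    # Route the request to the user's home region so data stays resident there.
    region = claims.get("region", "us")
    return UPSTREAMS.get(region, UPSTREAMS["us"])
```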

Conclusion

Implemented correctly, an OAuth 2.0 token-based architecture provides a complete zero trust solution for APIs. The best solutions require cross-team collaboration to meet the crucial requirements of all DevSecOps roles. Business owners can then deliver digital services with future-facing security. This solution should offer multiple user authentication methods and first-class interoperability with external systems.

Since OAuth 2.0 requires a distributed architecture, teams often must learn new best practices and put in place productive technical setups. Developers can start by following solid standards-based learning resources like the Curity Guides. The security components you choose are also important. Use an API gateway with good support for the intelligent processing of API requests. Also, verify early that the proposed authorization server has up-to-date support for standards and is extensible. This will enable you to deliver the right claims to APIs and customize user authentication when required.

This article has summarized the core setup needed to implement a modern token-based architecture. Once the correct separation is in place, you can meet all of the main requirements for all DevSecOps roles. The architecture will then scale to many components and other security use cases.

GitLab All in on AI: CEO Predicts Increased Demand for Coders

GitLab is all in on AI, with CEO and co-founder Sid Sijbrandij calling it “one of the most exciting technology developments of our time” and making an unusual prediction that it will create demand for more programmers.

“AI represents a major shift for our industry. It fundamentally changes the way that software is developed,” Sijbrandij said on GitLab’s earnings call Monday. “We believe it will accelerate our ability to help organizations make software, faster. I am excited about this next wave of technology innovation.”

GitLab plans to incorporate AI at all levels of its DevSecOps platform, he added.

“We believe that an AI-powered platform focused solely on the Developer persona is incomplete. It is missing essential Security, Operations, and Enterprise functionality,” Sijbrandij said. “Remember: developers spend only a small fraction of their time developing code. The real promise of AI extends far beyond code creation.”

During the first quarter of the year, GitLab delivered five new AI features, followed by five more in May with the release of GitLab 16 — including a beta of Code Suggestions, as well as security testing and analysis, observability and proactive vulnerability detection. Additional AI-powered features available include Suggested Reviewers for code review, Explain This Vulnerability for vulnerability remediation, and Value Stream Forecasting for predicting future team efficiency. Code Suggestions does what its name implies, making code suggestions to developers as they type.

“We’re proud to have 10 AI features available to customers today, almost three times more than the competition,” he said, adding that applying AI to a single data store, for the full software development life cycle, also creates compelling business outcomes and is something he believes can be done with GitLab.

GitLab continues to iterate on Code Suggestions and expects to make it generally available later this year. The company has also boosted language support from six languages to 13, so more developers can use it, he added.

“Code Suggestions is uniquely built with privacy first as a critical foundation,” he said. “Our customers’ proprietary source code never leaves GitLab’s cloud infrastructure. This means that their source code stays secure. In addition, model output is not stored and not used as training data.”

AI Support for Development Teams

Also later this year, the company plans to introduce an AI add-on focused on supporting development teams, which will include Code Suggestions functionality, across all GitLab’s tiers at an anticipated price point of $9 per user per month, billed annually, he said.

He noted that they’d had many conversations with senior-level customers, but one comment from the CTO of a top European bank stood out.

“When the conversation moved into AI, the CTO said something extremely interesting. He said: Code generation is only one aspect of the development cycle. If we only optimize code generation, everything else downstream from the development team — including QA, security, and operations — breaks. It breaks because these other teams involved in software development can’t keep up,” Sijbrandij said. “This point — incorporating AI throughout the software development lifecycle — is at the core of our AI strategy.”

Companies Reevaluating Strategies in the Wake of AI

Customers are also reevaluating their own software supply chain through the AI lens, he added. Additionally, chief information security officers are engaging with AI and applying governance, security, compliance and auditability to it.

He predicted that AI will increase GitLab’s market for three reasons. First, AI will make writing software easier, which in turn will expand the audience of people — including junior and citizen developers — who build software. Second, as developers become more productive, software will become less expensive to create, which will fuel demand for more software and require more developers to meet the additional need. Third, the company expects more customers will turn to its solutions as they build machine learning models and AI into their applications.

“As we add ModelOps capabilities to our DevSecOps platform, this will invite data science teams as new personas, and will allow these teams to work alongside their Dev, Sec, and Ops counterparts,” he said. “We see ModelOps as a big opportunity for GitLab.”

Sijbrandij also shared how global security and aerospace company Lockheed Martin used GitLab to consolidate its toolchain and reduce complexity and costs. The Lockheed Martin team has reported 80x faster CI pipeline builds and 90% less time spent on system maintenance, he said, adding that it has retired thousands of Jenkins servers. The team has also moved from monthly or weekly deliveries to daily or multiple daily deliveries.

Unlocking DevSecOps’ Potential: Challenges, Successes, Future

It has been more than 15 years since DevOps emerged on the technology landscape, promising to revolutionize team collaboration and streamline development processes. While some now say platform engineering is the one true way forward, DevOps’ scope widened to include security, giving rise to DevSecOps, which remains influential. Unfortunately, even as the need for coding and operational security grows, a Progress Software study has found that many organizations have struggled to implement DevSecOps.

To find out why, Progress interviewed 606 IT/Security/App Dev and DevOps decision-makers from organizations with over 500 employees across 11 countries. The survey’s goals were to identify what was hindering DevSecOps success and to uncover best practices from companies with thriving DevSecOps programs.

The Challenges

They found:

  1. DevSecOps success has been hindered by complexity and constant change.
  2. Effective DevSecOps requires collaboration and investment in culture.
  3. The desire to succeed in DevSecOps did not guarantee mastery of its practices.

These DevSecOps challenges included complexity, competing priorities, and a lack of clear business impact and Return on Investment (ROI). Additionally, while the participants recognized the potential benefits of adopting cloud native technology, AI, and Policy as Code in their DevSecOps strategy, they had trouble demonstrating the ROI for these investments. That, of course, made it difficult to secure buy-in from stakeholders.

In addition, despite security threats being the primary driver of DevSecOps evolution, many respondents proved only somewhat familiar with how security fits into DevSecOps. In short, they didn’t really understand the techniques they were trying to use. Specifically, they had trouble prioritizing security efforts, securing different types of workloads, and meeting delivery deadlines and audit requirements.

While everyone agreed that collaboration and culture were critical factors for successfully implementing DevSecOps, only 30% of the respondents felt confident in the level of collaboration between security and development teams. Furthermore, 71% agreed that culture was the biggest barrier to DevSecOps progress, yet only 16% prioritized culture as an area for optimization in the next 12-18 months. This discrepancy underscored the need for fostering a collaborative culture within organizations.

Addressing the Challenges

Therefore, to fully harness the potential of DevSecOps, organizations must address several key challenges. These are:

  1. Overcome obstacles to collaboration: Encourage cross-functional communication and collaboration between security, app development, and other teams.
  2. Incorporate new technologies and processes: Balance modernizing technology, processes, and culture, as focusing on just one area will not be enough.
  3. Address conflicting interests: Ensure leadership prioritizes and invests in key areas that drive DevSecOps success, including adopting a holistic approach that engages teams from across the organization.
  4. Build confidence in securing cloud native adoption: Focus on implementing and leveraging the benefits of cloud-first technologies while considering cloud security.

It’s become clear that even though we’ve been using DevOps for years, many of us still haven’t mastered creating an effective DevSecOps culture. Companies must engage in honest conversations from the executive level down about where they are in their journey and how to move forward to success.

‘Running Service’ Blueprint for a Kubernetes Developer Portal

Internal developer portals exist to provide developers with a product-like experience that’s free of cognitive load, allowing developers to stay in the flow and be productive. They are set up by platform engineering teams to help developers serve themselves within guardrails and internal quality standards.

With portals, developers can simply and easily set up an ephemeral environment, restart a Kubernetes cluster, redeploy an image tag or scaffold a microservice. Platform engineering will make those actions reusable in the platform, and the internal developer portal will act as the interface to the platform and then reflect the changes in the software catalog.

But internal developer portals are more than loosely coupled product-like user interfaces that make developer lives easier. The internal developer portal also has a valuable software catalog that includes everything application-related in your engineering, from CI/CD metadata through cloud resources, Kubernetes, services and more.

The value of the software catalog is much greater than the metadata it contains (which is pretty neat, too) and goes way beyond showing who owns a service or where its logs are. In addition to being a single source of truth, its value comes from the way it provides context, especially in case of runtime data. It can quickly answer questions such as, “What is the current running version of service x in environment y?” even in cases that contain feature flags, canary or blue/green deployments.

Context and runtime data are the focus of this article. We will provide a detailed example of an internal developer portal for Kubernetes objects. We will then show the power of the software catalog and the fact that it can support workflow automation — anything from time to live (TTL) termination through service locking, triggering automated actions when services degrade, etc. — as a result of its combination of metadata and runtime data.

Spotify’s Backstage C4 Model for Internal Developer Portals

Software catalogs need a data model, and before you begin, you need to define it. It’s nothing too complex, but you do need a schema identifying what needs to be inside your software catalog. Software catalogs need to be unopinionated and completely flexible, so the best option is to let you define the data model yourself.

In Port, the schema for a type of entity (let’s say a K8s cluster) is called a Blueprint. The actual instance (the actual cluster in this case) is called an entity. In Spotify’s Backstage, the blueprint is called a “kind.”

Backstage, a leading open source internal developer portal and the third most popular Cloud Native Computing Foundation (CNCF) project, recommends beginning with a data model consisting of six blueprints (or kinds):

  • Component
  • API
  • Resource
  • System
  • Domain
  • Group

As Spotify’s senior engineer Renato Kalman and staff engineer Johan Wallin explain here, in designing Backstage they had a software visualization challenge: They needed a “standardized software metadata model to create a common language for communicating software architecture.” What they came up with was the C4 model. You can see an example of a Backstage C4 model here.

But this data model misses one point: the “running service” blueprint.

What Is a Running Service?

Your code is not your app. The code that lives in your repo or in a container image isn’t the app. In real life, your app exists on an environment and serves something (api/other services/users) within an ecosystem of tools and dependencies. It behaves differently depending on where it is.

The running-service blueprint, or as we sometimes call it, “service in environment,” reflects the fact that a single “service” is usually deployed to many different environments. Services can live in a variety of environments: staging, development, production. Services can also live in many different customer environments, especially in the case of single-tenant architectures.

This simple fact that the service lives in many different environments is reflected by the idea of the “running service” blueprint in Port. The “running service” entity lets us see the service “in the wild” — in the specific environment it actually lives in. Only this provides us with the correct and actionable context to understand what is going on.

Sticking to a static software catalog with a static data model that only includes metadata and not runtime data doesn’t provide the context we need. Insights exist only if you look at the real instance of the running microservice.

A Kubernetes Internal Developer Portal: The ‘Running Service’ Blueprint

Some argue that the growth of Kubernetes is one of the core drivers behind platform engineering. Kubernetes complexity, the expertise required of its practitioners and the recent movement of many developers to cloud native development all created increased load and friction between developers and DevOps.

Internal developer portals abstract Kubernetes away for developers. They let developers understand Kubernetes by showing them the relevant data, in context. They also support developer self-service actions. It’s important to ensure that these Kubernetes internal developer portals include:

  • All Kubernetes objects in the software catalog, not just microservices
  • Multicluster support
  • CRD support

Let’s look at how to set up blueprints (the data model) for a Kubernetes internal developer portal and then at how and when we include the running service blueprint for Kubernetes.

This is the basic set of blueprints for Kubernetes:

Workload is the “running service” for Kubernetes. It is a generic name for stateful sets, deployments, daemon sets and any other workload running in the cluster.

  • A cluster represents a Kubernetes cluster in the infrastructure, providing the high-level connection between the different objects in the Kubernetes cluster.
  • A node is a server that hosts and provides the runtime for the different applications and microservices in the Kubernetes cluster.
  • A namespace is meant to group together many resources inside the same Kubernetes cluster, giving you the option to view how a complete environment hosted on the same Kubernetes cluster is connected.
  • The workload is meant to be the focal point that provides the most relevant context to a developer about how their app is doing. The workload entity provides the developer with an abstract view of their different workloads. They can see the current status of the workload, such as instance count and health. By going upstream in the dependency tree, the developer can see what other applications and microservices are running next to their own workload, letting the developer understand if there are any connectivity or functionality issues.
  • A pod is an instance of the workload, giving us visibility into the health of the pieces that make up the complete workload, as well as the ability to understand if there are any specific issues in the availability of the service provided by the workload.
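
Bringing the list above together, here is a hedged sketch of what a “workload” (running service) blueprint definition might look like as data. The field names are illustrative assumptions, not Port’s exact schema:

```python
# Illustrative blueprint for a "workload" entity, mixing static metadata
# with the runtime fields that make a running service useful in context.
workload_blueprint = {
    "identifier": "workload",
    "title": "Workload",
    "schema": {
        "properties": {
            "image": {"type": "string"},
            "replicas": {"type": "number"},
            "healthStatus": {"type": "string", "enum": ["Healthy", "Degraded"]},
            "lastSyncedAt": {"type": "string", "format": "date-time"},
        }
    },
    "relations": {
        "namespace": {"target": "namespace", "many": False},  # where it runs
        "pods": {"target": "pod", "many": True},               # its instances
    },
}
```

Runtime fields such as healthStatus are what distinguish this blueprint from static catalog metadata.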

You Should Begin Using a Running Service or Workload Blueprint

We’ve seen that the runtime blueprint, regardless of whether we call it “running service,” “workload” or even the literal “service in environment,” is useful. It reflects the reality in which a single service usually exists in several environments at the same time, such as development, staging, etc. It can also be deployed in many different customer environments. The running service provides the runtime data so we can understand the service in the context of its environment and deployment, as well as its real-time information, from uptime to status.

You can use Port for free at getport.io, or check a fully populated Port demo here.

Building GPT Applications on Open Source Stack LangChain

This is the first of two articles.

Today, we see great eagerness to harness the power of generative pre-trained transformer (GPT) models and build intelligent and interactive applications. Fortunately, with the availability of open source tools and frameworks, like LangChain, developers can now leverage the benefits of GPT models in their projects. LangChain is a software development framework designed to simplify the creation of applications using large language models (LLMs). In this first article, we’ll explore three essential points that developers should consider when building GPT applications on the open source stack provided by LangChain. In the second article, we’ll work through a code example using LangChain to demonstrate its power and ease of use.

Quality Data and Diverse Training

Building successful GPT applications depends upon the quality and diversity of the training data. GPT models rely heavily on large-scale datasets to learn patterns, understand context and generate meaningful outputs. When working with LangChain, developers must therefore prioritize the data they use for training. Consider the following three points to ensure data quality and diversity.

Data Collection Strategy

Define a comprehensive data collection strategy tailored to the application’s specific domain and use case. Evaluate available datasets, explore domain-specific sources and consider incorporating user-generated data for a more diverse and contextual training experience.

Data Pre-Processing

Dedicate time and resources to pre-process the data. This will improve its quality, which, in turn, enhances the model’s performance. Cleaning the data, removing noise, handling duplicates and normalizing the format are essential, well-known pre-processing tasks. Use utilities for data pre-processing, simplifying the transformation of raw data into a suitable format for GPT model training.
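
As a minimal illustration of those steps, here is a small sketch in Python; the cleaning rules are assumptions, since real pipelines are domain-specific:

```python
import re

def preprocess(records: list[str]) -> list[str]:
    # Normalize whitespace and case, drop empties and exact duplicates,
    # a minimal version of the cleaning steps described above.
    seen, cleaned = set(), []
    for text in records:
        text = re.sub(r"\s+", " ", text).strip()
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned
```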

Ethical Considerations

There may be potential biases and ethical concerns within the data. GPT models have been known to amplify existing biases present in the training data. Therefore, regularly evaluate and address biases to ensure the GPT application is fair, inclusive and respects user diversity.

Fine-Tuning and Model Optimization

A pre-trained GPT model provides a powerful starting point, but fine-tuning is crucial to make it more contextually relevant and tailored to specific applications. Developers can employ various techniques to optimize GPT models and improve their performance. Consider the following three points for fine-tuning and model optimization.

Task-Specific Data

Gather task-specific data that aligns with the application’s objectives. Fine-tuning GPT models on relevant data helps them understand the specific nuances and vocabulary of the application’s domain, leading to more accurate and meaningful outputs.

Hyperparameter Tuning

Experiment with different hyperparameter settings during the fine-tuning process. Adjusting hyperparameters such as learning rates, batch sizes and regularization techniques can significantly affect the model’s performance. Use tuning capabilities to iterate and find the optimal set of hyperparameters for the GPT application.

Iterative Feedback Loop

Continuously evaluate and refine the GPT application through an iterative feedback loop. This can include collecting user feedback, monitoring the application’s performance and incorporating improvements based on user interactions. Over time, this iterative approach helps maintain and enhance the application’s accuracy, relevance and user satisfaction.

User Experience and Deployment Considerations

Developers should not only focus on the underlying GPT models, but also on creating a seamless and engaging user experience for their applications. Additionally, deployment considerations play a vital role in ensuring smooth and efficient operation. Consider the following three points for user experience and deployment.

Prompt Design and Context Management

Craft meaningful and contextually appropriate prompts to guide user interactions with the GPT application. Provide clear instructions, set user expectations and enable users to customize and control the generated outputs. Effective prompt design contributes to a better user experience.
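
As a hedged sketch of prompt design with LangChain (using the import paths current as of mid-2023; the prompt text and use case are assumptions), a template can encode instructions and expectations once and reuse them for every interaction:

```python
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

# A prompt that sets clear instructions and expectations for the user.
prompt = PromptTemplate(
    input_variables=["question"],
    template=(
        "You are a support assistant for an internal developer platform.\n"
        "Answer concisely, and say 'I don't know' when unsure.\n\n"
        "Question: {question}"
    ),
)

# Requires the OPENAI_API_KEY environment variable to be set.
chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
print(chain.run(question="How do I request a new staging environment?"))
```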

Scalable Deployment

Consider deployment strategies that ensure the scalability and efficiency of the GPT application. Use cloud services, containerization and serverless architectures to effectively handle varying workloads and user demands.

Continuous Monitoring

Implement a robust monitoring system to track the performance and usage patterns of the GPT application. Monitor resource utilization, response times and user feedback to identify potential bottlenecks and areas for improvement.

Summary

By considering these three key aspects — quality data and diverse training, fine-tuning and model optimization, and user experience and deployment considerations — developers can build powerful GPT applications on the open source stack provided by LangChain. In an upcoming article, I’ll start exploring the potential of GPT models and LangChain through a worked example. I will also host a workshop on June 22 during which I will go through building a ChatGPT application using LangChain. You can sign up here.

SRE vs. DevOps? Successful Platform Engineering Needs Both

When talking about cloud native computing and digital transformation, two industry terms frequently appear: site reliability engineering (SRE) and DevOps. Often, they’re mentioned in opposition: SRE versus DevOps. But that is wrong.

To succeed in the cloud native world, organizations need both DevOps and SRE. Moreover, teams need a third element to assure transformation success as they move into the cloud native world: a platform engineering team.

That makes it important to understand the definition of each term, the distinctions between them, what they do and how they benefit business, as well as why organizations need all three to succeed.

What Is DevOps?

DevOps is a software methodology, but also an IT culture. It combines software development and IT operations to streamline software and services delivery, with the objective of building software more efficiently and harnessing automation as much as possible to drive faster deployment of higher-quality software. Its overall goal is to make system changes easier and rely on continuous improvement instead of massive improvement initiatives.

DevOps’ cultural implications come from its emphasis on enhanced collaboration and communication between different teams. Developers, operations staff, quality assurance (QA) professionals and security specialists all work together using automation tools to accelerate and standardize the development process. These teams also use CI/CD techniques to test, integrate and deploy software changes as quickly and reliably as possible.

What Problems Does DevOps Solve?

Legacy software development practices such as waterfall are typically quite slow and can cause conflicts between developers and operations teams. Prior to DevOps, the development team would already be working on a new project by the time operations completed QA and security checks. The organizational silos between development and operations discouraged collaboration to fix issues, instead promoting finger-pointing. This frustrated business clients and other stakeholders who were impatiently waiting for an application to move into production.

DevOps also solves the testing issue in traditional development environments. Without rigorous testing, software bugs can go undetected, which leads to unplanned downtime of critical production systems, user frustration and even lost revenue. With CI/CD, DevOps implements testing earlier, avoiding the last-minute rush to test quickly and push apps out the door.

Security is another critical issue. DevOps incorporates continuous security audits as an integral part of the development process to identify and address vulnerabilities before bad actors exploit them.

Benefits of DevOps

Some advantages of a DevOps culture include:

  • Faster time to market: DevOps enables organizations to bring new products and features to production faster through a streamlined development process and by eliminating bottlenecks.
  • Improved collaboration: Having teams working together helps to reduce silos and improve communication across the organization.
  • Better quality: With testing and deployment automation, DevOps can help to reduce the number of errors and improve the overall quality of the software.
  • Increased efficiency: Automation aids in velocity by reducing repetitive tasks and manual intervention.
  • Greater scalability: DevOps provides a framework to build scalable and resilient software capable of supporting rapidly growing businesses.

What Is SRE?

Site reliability engineering (SRE) is a discipline that applies software engineering to operations to build and maintain highly reliable and scalable applications. SRE started at Google but is now widely adopted throughout the technology industry.

Part of the SRE creed is that “every failure is an opportunity for learning” and thus engineers must find the problem’s contributing factors and make adjustments at the system level to ensure that particular issue doesn’t resurface.

What Problems Does SRE Solve?

First and foremost, SRE tries to reduce system outages and downtime by identifying and addressing issues quickly. With investigations and incident analyses, SRE teams contribute to the DevOps team’s ability to build and modify systems to be highly available and resilient by design.

SRE helps maintain system performance to ensure that software in production meets all user needs, whether internal or external. The SRE team also monitors usage patterns and capacity to ensure that the IT environment can handle expected traffic, avoiding overload and service disruption.

SRE teams collaborate closely with DevOps teams to confirm that issues are truly resolved. There is a constant feedback loop between SRE and DevOps to guarantee that flaws are fixed at the source and not just temporarily patched.

The Benefits of SRE

Beyond improving systems reliability — its primary objective — SRE teams help design operable systems that are less likely to fail or experience unplanned downtime. SRE promotes:

  • Faster incident resolution: With a data-driven approach to issue identification, SRE teams can address them quickly and reduce the time to detect and resolve incidents.
  • Efficient resource utilization: SRE teams optimize resource usage to ensure that systems can scale efficiently without requiring significant additional resources.
  • Improved collaboration: Close work with development teams ensures that software is designed with reliability in mind from the outset.
  • Greater automation: SRE teams use automation to reduce the risk of human error and increase efficiency, which frees up both DevOps and SRE teams’ time for more strategic work.

What Is Platform Engineering?

Platform engineering is the practice of building and maintaining an internal software platform — consisting of tools, services, and infrastructure — that lets developers effectively and efficiently build, deploy, operate and observe applications. Platform engineers’ objective is to enable developers to focus on writing code rather than infrastructure issues.

Many platform engineering teams designate “golden paths” for application development in pursuit of maximum reliability, quality and developer productivity. Golden paths are pre-architected and supported approaches to build and deploy software. If development teams use golden paths, then the platform engineering team supports production, and developers don’t have to learn all the underlying technology. This dramatically accelerates an application’s time to market.

Platform engineers monitor developer efficiency for the entire software development life cycle, from source code to production, to ensure that developers have the required tools and support to produce the highest-quality applications.

What Problems Does Platform Engineering Solve?

Platform engineering directly addresses the overall developer experience. Developers are getting more frustrated. According to a recent survey, DevOps team members spend, on average, more than 15 hours each week on activities other than coding.

This includes internal tool maintenance, development environment setup and pipeline debugging. The cost of this is astronomical. In the United States alone, businesses are losing up to $61 billion annually, according to Garden.io.

The complexity of managing today’s cloud native applications drains DevOps teams. Building and operating modern applications requires significant amounts of infrastructure and an entire portfolio of diverse tools. When individual developers or teams choose to use different tools and processes to work on an application, this tooling inconsistency and incompatibility causes delays and errors. To overcome this, platform engineering teams provide a standardized set of tools and infrastructure that all project developers can use to build and deploy the app more easily.

Additionally, scaling applications is difficult and time-consuming, especially when traffic and usage patterns change over time. Platform engineering teams address this with their golden paths — or environments designed to scale quickly and easily — and logical application configuration.

Platform engineering also helps with reliability. Development teams that use a set of shared tools and infrastructure tested for interoperability and designed for reliability and availability make more reliable software.

Platform engineering also lets developers access the tools they need themselves. Instead of opening an IT ticket or having a conversation about creating a new database, a developer can simply spin one up in a user interface and know the configuration of any alerts, replications and operating parameters.

Finally, platform engineering addresses the high cost of building applications the traditional way, in which the development team purchases a broad range of tools and environments, frequently with overlapping functionality. Through standardization and automation, platform engineering minimizes these costs.

The Benefits of Platform Engineering

A well-designed development platform with tested and optimized golden paths helps developers build and deploy applications faster with pre-built components and infrastructure. This reduces the amount of required time and effort to build and configure these components from scratch. Other benefits include:

  • Standardization and consistency: Platform engineering delivers a standard set of tools and infrastructure to ensure that all applications built on the platform are consistent and meet the same quality standards.
  • Scalability and flexibility: Environments provided by the platform engineering team enable developers to deploy and scale applications quickly and easily.
  • Reduced operational costs: With task automation for deployment, monitoring and scaling, platform engineering frees up DevOps teams to focus on more strategic work.
  • Improved application reliability and availability: A platform engineering team provides a set of shared tools and infrastructure specifically designed for high uptime and 24/7 access.

Puppet’s 2023 State of DevOps Report found that platform engineering multiplies the chances of DevOps success.

What Are the Differences Between DevOps, SRE and Platform Engineering?

Organizations venturing into the cloud native world must do things differently to get transformative results; cloud native problems require cloud native solutions.

The first step is usually to adopt a DevOps culture if they don’t already have one. But DevOps needs support to make the transition and operate in cloud native environments. SRE and platform engineering teams provide such support.

It might be possible to get by with just two — or even one — of these teams, but an organization aiming to modernize some or all of its workloads to cloud native should consider establishing all three teams.

  • DevOps: Responsible for the complete life cycle of the apps, from source to production, and for modifying and enhancing apps post-production.
  • SRE: Primarily focused on application scalability, reliability, availability and observability. This team typically acts in crisis management mode when the performance or availability of an app is at risk.
  • Platform engineering: The definition is still evolving, but platform engineering’s role of setting standard tools and processes to speed development is acknowledged as an extraordinarily helpful bridge for DevOps to make the transition from monolithic to microservices-based cloud native computing.

Each team has a specific role and objectives, yet all three work together best to ensure the business can deliver cloud native applications and environments according to industry best practices.

How Chronosphere Supports All Three

Adding DevOps, SRE and platform engineering teams boosts cloud native adoption, and it succeeds when these teams have complete visibility into their cloud native apps and cloud environments. This comes from a new generation of monitoring and observability solutions.

Cloud-hosted monitoring and application performance monitoring (APM) were born in the pre-cloud native world, one with very different assumptions. It’s no wonder they struggle with cloud native architectures. A cloud native observability solution like Chronosphere, architected for modern digital business, can tie all three teams together.

With cloud native monitoring and observability, increased visibility into overall metrics usage and the power to set quotas for quickly growing services, Chronosphere gives organizations the flexibility and control they need over the entire application life cycle.

The post SRE vs. DevOps? Successful Platform Engineering Needs Both appeared first on The New Stack.

The Art of Platform Marketing: You’ve Gotta Sell It https://thenewstack.io/the-art-of-platform-marketing-youve-gotta-sell-it/ Tue, 06 Jun 2023 13:05:41 +0000 https://thenewstack.io/?p=22710139

“How do we get developers to actually use our platform?”

This is a question I’m often asked. A good first step is to make sure you take a product management approach and build an app platform that developers actually want to use, making sure that the golden path to production is not only useful but, well, fun. However, there is a second step that is often overlooked and misunderstood by platform teams: good, old-fashioned marketing. Once you have your platform set up, you have to build what is essentially a full marketing plan to drive interest in and use of that platform. This includes not only brand, messaging, positioning and outreach campaigns, but also platform advocacy.

What Platform Marketing Does

Platform marketing is used to drive awareness, trust and interest, but it also gives you an opportunity to get product management feedback about your platform. That last part is one of the underappreciated parts of advocacy (or “developer relations” as it’s sometimes called). When you’re developing in a product mindset, as most platform teams do, you’ll appreciate as much feedback as you can get from your customers — your developers. When infrastructure teams tell me they’ve built a platform or a Kubernetes cloud for developers but that developers aren’t using it, it’s usually because they need to do much more platform marketing.

Marketing doesn’t come easy to infrastructure people. It’s an off-putting word, perhaps only rivaled by “enterprise sales rep.” As ever with eye-roll-inducing phrases, what people actually dislike is bad, boring and useless marketing. At large organizations, most of the successful platform teams I talk with pay close attention to marketing, to good marketing. The likes of Mercedes-Benz, JPMorgan Chase, Duke Energy, The Home Depot, BT, the U.S. Air Force and Army and many others start their platform marketing plans from day one. And, in fact, marketing is a key part of scaling and sustaining how these organizations improve the way they make software.

I’ll be covering platform marketing as one of the “7 lessons from 7 years of running platforms” in my upcoming talk at PlatformCon, being held June 8 and 9. In the meantime, here’s a preview of one of those seven lessons: marketing and advocacy.

Brand

“Do you have a T-shirt yet?” my colleague DaShaun Carter likes to ask platform teams. This can seem like a flippant question, but it gets to an important part of platform marketing: establishing a brand. You need a name for your platform and the philosophy of software it supports. For example, the U.S. Air Force uses the brand Kessel Run, and JPMorgan Chase has the Gaia brand.

A brand performs two functions.

First, it creates an identity and a definition of what exactly your platform is. People tend to identify with the tools they use. They’re Java developers, Rust developers, Linux administrators, they follow XP or they’re site reliability engineers (SREs) instead of “DevOps engineers,” and so forth. That identity creates affinity and attraction to the brand — in this case, your platform. In doing so, it creates a certain joy in using the platform and a passion for it.

Second, a brand helps define what your unique methodology and philosophy is. No matter if you’re doing agile, following DevOps principles, practicing SRE or sorting out what “platform engineering” means this quarter, you’ll need to adapt those methodologies to your organization’s unique needs. The sages of these methodologies aren’t so fond of cafeteria DevOps, where you just pick and choose the practices you want to use. However, in many large organizations, to get better, you need to make compromises and adapt stringent methodology principles.

Using your own name helps you take ownership of the methodology you’re putting together and change it as you learn what works. It’s a good time saver too. As one executive told me on a long elevator ride a few years back, don’t ever use the word “agile” when you’re doing agile. The first thing that’ll happen is that someone will start complaining that you’re not doing real agile, that you’re doing it wrong. And then you get stuck in a narcissism-of-small-differences black hole instead of just getting on with the work.

The Book

You’re certainly going to need a manual, training, documentation and the usual three-ring binder material. But you’ll also want to write up the thinking that’s behind the brand. You need to codify and distribute your intentions, goals and principles. This is something more tactical, more usable than vision and strategy.

The exact content of The Book will vary, so it’s good to look at examples for inspiration. While it’s just a narrow slice of what would be in The Book, the UK Digital Service has a great list of design principles. You can see how we think about software at VMware Tanzu Labs in things like our FAQ and books like “Radically Collaborative Patterns for Software Makers.”

As you scale your platform to hundreds, then thousands of developers, this ongoing documentation of your thinking will be critical. It’s more than just tech documentation; it’s documenting the culture that your platform is built to support. This book will also help the platform team remember the point of the platform and their own work: for example, keeping the organization focused on building well-designed software, using lean-design techniques and deploying weekly.

Platform Advocacy

Finally, the successful platform teams I talk with have very active platform advocacy. This means having at least one person working full time to just talk with, work with and listen to the people who use your platforms, usually developers. The role of “developer advocate” is pretty well understood by us vendors and cloud providers. Developer advocates love talking to people, and we also love talking about our craft. This means you can find out how it’s done easily by just asking us.

You’ll probably start with just one platform advocate who visits with developer teams throughout your organization listening to what these teams do, teaching them how to use the platform and associated methodologies and listening to their feedback. The advocate acts as a spreader of your platform, a booster and an explainer. Also, often overlooked, the advocate takes feedback from developers and others back to the platform team. They advocate for both the platform team and for the platform users.

As your platform and overall software transformation scale, you’ll add more advocates. Like JPMorgan Chase, you might even have a whole team of platform advocates. The Cloud Foundry platform team at Mercedes-Benz provides training, systematic feedback collection, quarterly community updates and numerous other community management functions that you’d expect an advocate to help with.

One of the common, maybe required, practices the advocacy team follows is holding quarterly internal conferences. These are actual, in-person conferences, often rotating through different regions and office locations with an online component. At these conferences, your platform team and executive sponsors talk a little bit about the platform, but you mostly get your customers — developer teams — to present and talk about the projects they’ve worked on. This serves two functions: training and, that’s right, marketing.

The marketing you’re taking advantage of at internal conferences is the most coveted of all marketing treasures: word of mouth. Having developers tell other developers that your platform is good, great even, will be the best way to get developers to use your platform, and use it well.

Start Platform Marketing on Day One

In addition to those important aspects of platform marketing, you’ll also need to do some marketing fundamentals, like producing content and documentation and working with product management to understand your customers and go to where they are, so to speak.

I haven’t seen many platform teams (or any, perhaps) that have scaled and sustained their developer platform without platform marketing. You’ve got to start thinking about marketing from day one, assigning at least one full-time advocate to start that work of creating a brand name and documenting your ongoing platform philosophy and principles. As with developer advocacy, you don’t need to spend time reinventing the wheel: Tech marketing is a well-understood set of practices. The trick is to actually do it.

If you want to hear the other six lessons of scaling and sustaining platforms in large organizations, check out my full talk at PlatformCon, “7 lessons from 7 years of running platforms.”

The post The Art of Platform Marketing: You’ve Gotta Sell It appeared first on The New Stack.

7 Core Elements of an Internal Developer Platform https://thenewstack.io/7-core-elements-of-an-internal-developer-platform/ Mon, 05 Jun 2023 13:41:04 +0000 https://thenewstack.io/?p=22709978

What does it take to build an internal developer platform? What are the tools and platforms that can make it work? This post will discuss the architecture and tools required to stand up a fully operational internal developer platform. To see the actual steps of setting up the platform, watch this video.

Why Do We Want an Internal Developer Platform?

Platform engineering’s overarching goal is to drive developer autonomy. If a developer needs a database, there should be a mechanism to get it, no matter if that person is a database administrator or a Node.js developer. If a developer needs to manage an application in Kubernetes, that person doesn’t need to spend years trying to understand how Kubernetes works. All these actions should be simple to accomplish.

A developer should be able to accomplish what they need by defining a simple manifest or using a web UI. We want to enable all developers to consume services that will help them get what they need. Experts (platform engineers) will create those services in the internal developer portal, and users will consume them in its graphical user interface or by writing manifests directly and pushing them to git.

The High-Level Design of an Internal Developer Platform — 7 Core Elements

An internal developer platform needs several parts to become fully operational. For each part we will recommend a tool, but each can be exchanged for a similar one. The core idea is to map out the functionalities needed to build the platform:

  • A control plane: The platform needs a control plane that will be in charge of managing all the resources, no matter if they are applications running in a Kubernetes cluster or elsewhere, or if the infrastructure or services are in Amazon Web Services (AWS), Azure, Google Cloud or anywhere else. Our recommended tool here is Crossplane.
  • A control plane interface: This will enable everyone to interact with the control plane and manage resources at the right level of abstraction. Our recommended tool here is Crossplane Compositions.
  • Git: The desired states will be stored in git, so we’ll have to add a GitOps tool into the mix. Its job will be to synchronize whatever we put in git with the control plane cluster. Our recommended tool here is Argo CD.
  • Database and schema management: Given that state is inevitable, we will need to have databases as well. Those databases will be managed by the control plane but to work well, we will also need a way to manage schemas inside those databases. Our recommended tool here is SchemaHero.
  • Secrets manager: For any confidential information that we cannot store in git, we’ll need a way to manage secrets in a secrets manager. Those secrets can be in any secrets manager. Our recommended tool to pull secrets from there is External Secrets Operator (ESO).
  • An internal developer portal/graphical user interface: In case users don’t want to push manifests directly to git, we should provide them with a user interface that will enable them to see what’s running as well as to execute processes that will create new resources and store them in git. Our recommended tool here is Port.
  • CI/CD pipelines: Finally we will need pipelines to execute one-shot actions like the creation of new repositories based on templates, building images with new release changes to manifests and so on. Our recommended tool here is GitHub Actions.

The setup will require a few additional tools, but the list above is a must.

The diagram below shows how each of the elements interacts with each other. You can use it as a reference as you read through this article.

Let’s examine the role of each layer in the setup:

Control Plane

Let’s talk about control planes: We need a single API acting as an entry point. This is the main point of interaction for the internal developer platform. In turn, it will manage resources no matter where they are. We can use Crossplane with providers, which enables us to manage not only Kubernetes but also AWS, Google Cloud, Azure or other types of resources. Crossplane extends the Kubernetes API with custom resource definitions (CRDs) that we can manage with kubectl to create deployments and services and to manage databases in hyperscaler clusters.
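As a sketch of what that looks like in practice, the manifest below asks Crossplane’s AWS provider to create and keep reconciling a managed PostgreSQL instance. The API version and field names vary by provider release, and the instance details are illustrative:

```yaml
apiVersion: database.aws.crossplane.io/v1beta1
kind: RDSInstance
metadata:
  name: example-db
spec:
  forProvider:
    region: us-east-1
    dbInstanceClass: db.t3.small
    engine: postgres
    allocatedStorage: 20
    masterUsername: adminuser
  # Crossplane writes the generated connection details to this Secret.
  writeConnectionSecretToRef:
    name: example-db-conn
    namespace: crossplane-system
```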

However, this alone isn’t enough for a full-fledged internal developer platform. An application can easily consist of dozens of resources. Infrastructure can be much more complicated than that. Most importantly, all those low-level resources are not at the right levels of abstraction for people who are not Kubernetes or AWS or Google Cloud specialists. We need something that is more user-friendly.

A User-Friendly Interface for the Control Plane

The control plane interface can act as the platform API when you’re 100% GitOps. It shouldn’t be confused with the internal developer portal, which acts as the graphical user interface. We can use Crossplane Compositions for that.

What is the right level of abstraction for the users of the platform we’re building? The rule is that we should hide, or abstract, anything that people don’t really care about when they use the internal developer platform. For instance, they probably don’t care about subnets or database storage. The right level of abstraction depends on the actual use of the platform and will differ from one organization to another. It’s up to you to discover how to best serve your customers and everyone else in your organization.

Crossplane Compositions enables us to create abstractions that can simplify the management of different kinds of applications. Next, we probably do not want anyone to interact directly with the cluster or the control plane. Instead of people sending requests directly to the control plane, they should be storing their desired states in git.
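To illustrate the abstraction, here is a hypothetical claim a developer might write in place of the dozens of low-level manifests it replaces. The API group, kind and fields are invented for this sketch; every platform team defines its own:

```yaml
apiVersion: example.org/v1alpha1    # a custom API defined by the platform team
kind: AppClaim
metadata:
  name: orders-service
spec:
  image: ghcr.io/acme/orders:1.4.2  # the only details developers care about
  scaling:
    min: 2
    max: 5
  database:
    size: small   # the composition expands this into subnets, storage, etc.
```

A short manifest like this is exactly the desired state that belongs in git, which is where GitOps comes in.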

Synchronize from Git with GitOps

Changing the state of resources by directly communicating with the control plane should not be allowed, since no one will know who changed what and when. Instead, we should push the desired state into git and, optionally, do reviews through pull requests. If we plug GitOps tools into the platform, the desired state will be synchronized with the control plane, which in turn will convert it into the actual state.

This is a safer approach as it doesn’t allow direct access to the control plane and also keeps track of the desired state. I recommend doing this with Argo CD, but Flux and other solutions are just as good.
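A minimal Argo CD Application sketch captures the pattern; the repository URL and paths are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: orders
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/acme/platform-state.git  # placeholder repo
    targetRevision: main
    path: apps/orders
  destination:
    server: https://kubernetes.default.svc   # the control plane cluster itself
    namespace: orders
  syncPolicy:
    automated:
      prune: true     # remove resources that were deleted from git
      selfHeal: true  # revert manual changes back to the state in git
```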

Schema Management

Databases need schemas. They differ from one application to another. To complete our internal developer platform, we need to figure out how to manage schemas, preferably as part of application definitions stored in git. There are many ways to manage schemas, but only a few enable us to specify them in a way that fits into the git model. The complication is that GitOps tools work only with Kubernetes resources, and that means that schemas should be defined as Kubernetes resources as well. This requires us to extend the Kubernetes API with CRDs that will enable us to define schemas as Kubernetes resources. I recommend using SchemaHero for that.
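As a sketch, a SchemaHero table definition lives in git beside the application manifests and is reconciled like any other Kubernetes resource. The database reference and columns here are illustrative:

```yaml
apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: orders
spec:
  database: orders-db    # refers to a SchemaHero Database resource
  name: orders
  schema:
    postgres:
      primaryKey: [id]
      columns:
        - name: id
          type: integer
        - name: customer_email
          type: varchar(255)
        - name: created_at
          type: timestamptz
```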

Secret Management

Some information shouldn’t be stored in git. Having confidential information such as passwords in git could easily result in a breach. Instead, we might want to store those in a secret manager like HashiCorp Vault or a solution provided by whichever hyperscaler you’re using. Still, those secrets need to reach the control plane so that processes inside it can authenticate with external APIs or access services, for example, databases. I recommend using External Secrets Operator (ESO) for that.
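The sketch below shows the shape of an ExternalSecret; the store name and secret path are placeholders for whatever your secrets manager actually holds:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: orders-db-credentials
spec:
  refreshInterval: 1h              # re-read the external store periodically
  secretStoreRef:
    name: vault-backend            # placeholder SecretStore configured separately
    kind: ClusterSecretStore
  target:
    name: orders-db-credentials    # the Kubernetes Secret ESO will create
  data:
    - secretKey: password
      remoteRef:
        key: databases/orders      # placeholder path in the secrets manager
        property: password
```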

Internal Developer Portal — Graphical User Interface

The internal developer platform needs a user interface to sit on top of everything we’ve built so far. This is the internal developer portal. It provides both a catalog of services people can use and an interface for developers to perform the actions we want them to handle autonomously. Specifically, we need a way to initialize a process that will create new repositories for applications, add sample code, provide manifests for the databases and other dependencies, create CI/CD pipelines, and so on.

For this setup we began with the Kubernetes catalog template from Port.

We will then add two additional blueprints that relate to the cluster blueprint: Environment and Backend App.

CI/CD Pipelines

Finally, we need pipelines. They are the last piece of the puzzle.

Even though we are using GitOps to synchronize the actual state into the desired state, we need pipelines for one-shot actions that should be executed only once for each commit. These could be steps to build binaries, run tests, build and push container images and so on.
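A sketch of such a one-shot pipeline, written here as a GitHub Actions workflow, might build an image and then bump the tag in the management repo that the GitOps tool watches. Registry and git credentials are omitted, and all names are placeholders:

```yaml
name: release
on:
  push:
    branches: [main]
jobs:
  build-and-promote:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: docker build -t ghcr.io/acme/orders:$GITHUB_SHA .
      - run: docker push ghcr.io/acme/orders:$GITHUB_SHA
      - name: Update the GitOps management repo
        run: |
          git clone https://github.com/acme/platform-state.git
          cd platform-state
          sed -i "s|image: ghcr.io/acme/orders:.*|image: ghcr.io/acme/orders:$GITHUB_SHA|" apps/orders/deployment.yaml
          git commit -am "orders: release $GITHUB_SHA"
          git push
```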

The Internal Developer Platform in Action

From the user (developer) perspective, a new application can be created with a simple click on a button in a Web UI or by defining a very simple manifest and pushing it to git. After that, the same interface can be used to observe all the relevant information about that application and corresponding dependencies.

Behind the scenes, however, the flow would be as follows.

  1. The user interacts with a Web UI (Port) or directly with git. The job of the internal developer portal in this case is to trigger an action that will create all the necessary resources.
  2. Creating all the relevant resources is a job done by the pipeline such as GitHub Actions. In turn, it creates a new repository with all the relevant files, such as source code, pipelines, application manifests, etc.
  3. As a result of pushing changes to the application repository (either as a result of the previous action or, later on, by making changes to the code), an application-specific pipeline is triggered (GitHub Actions) which, as a minimum, builds a container image, pushes it to the image registry and updates the manifests in the management repo, which is monitored by GitOps tools like Argo CD or Flux.
  4. GitOps tools detect changes to the management repo and synchronize them with the resources in the control plane cluster.
  5. The resources in the control plane cluster are picked up by corresponding controllers (Crossplane), which in turn create application resources (in other Kubernetes clusters or as hyperscaler services like AWS Lambda, Azure Container Apps or Google Cloud Run) as well as dependent resources like databases (self-managed or as services in a hyperscaler).

The post 7 Core Elements of an Internal Developer Platform appeared first on The New Stack.

Open Source Jira Alternative, Plane, Lands https://thenewstack.io/open-source-jira-alternative-plane-lands/ Fri, 02 Jun 2023 16:00:57 +0000 https://thenewstack.io/?p=22709934

A new(ish) open source Jira alternative called Plane has landed on the scene and has begun generating developer interest.

The open source project hopes to become a viable alternative to Jira, which by some estimates holds as much as 86.63% of the bug-and-issue-tracking market, according to 6Sense. Jira’s top competitors in the bug-and-issue-tracking category are BugHerd, YouTrack and Trac.

However, Jira is much more than bug and issue tracking; it is a whole suite of work management software, including project management, collaboration, configuration management and more.

Plane Explained

The open source Plane tool “helps you track your issues, epics, and product roadmaps in the simplest way possible,” the project’s GitHub description reads. “Meet Plane. An open source software development tool to manage issues, sprints, and product roadmaps with peace of mind.”

Indeed, “Plane is a simple, extensible, open source project and product management tool powered by AI. It allows users to start with a basic task-tracking tool and gradually adopt various project management frameworks like Agile, Waterfall, and many more,” wrote Vihar Kurama, co-founder and COO of Plane, in a blog post.

Yet, “Plane is still in its early days, not everything will be perfect yet, and hiccups may happen. Please let us know of any suggestions, ideas, or bugs that you encounter on our Discord or GitHub issues, and we will use your feedback to improve on our upcoming releases,” the description said.

Plane is built using a carefully selected tech stack, comprising Next.js for the frontend and Django for the backend, Kurama said.

“We utilize PostgreSQL as our primary database and Redis to manage background tasks,” he wrote in the post. “Additionally, our architecture includes two microservices, Gateway and Pilot. Gateway serves as a proxy server to our database, preventing the overloading of our primary server, while Pilot provides the interface for building integrations. We have also developed an AI service based on OpenAI, incorporating LangChain as an interface.”

Key Features

Key features of Plane include:

  • Issue Planning and Tracking: Quickly create issues and add details using a powerful rich text editor that supports file uploads. Add sub-properties and references to issues for better organization and tracking.
  • Issue Attachments: Collaborate effectively by attaching files to issues, making it easy for your team to find and share important project-related documents.
  • Layouts: Customize your project view with your preferred layout — choose from List, Kanban, or Calendar to visualize your project in a way that makes sense to you.
  • Cycles: Plan sprints with Cycles to keep your team on track and productive. Gain insights into your project’s progress with burn-down charts and other useful features.
  • Modules: Break down your large projects into smaller, more manageable modules. Assign modules between teams to easily track and plan your project’s progress.
  • Views: Create custom filters to display only the issues that matter to you. Save and share your filters in just a few clicks.
  • Pages: Plane pages function as an AI-powered notepad, allowing you to easily document issues, cycle plans, and module details, and then synchronize them with your issues.
  • Command K: Enjoy a better user experience with the new Command + K menu. Easily manage and navigate through your projects from one convenient location.
  • GitHub Sync: Streamline your planning process by syncing your GitHub issues with Plane. Keep all your issues in one place for better tracking and collaboration.

Factors a Developer Could Love

“In looking through their docs, I can see why developers are picking up on this project, as a) it features a nice permissive Apache 2.0 license, b) has some fairly consistent development work, and most importantly, c) plugs into GitHub where Plane can help greatly in imposing some sort of order upon software development efforts,” said Brad Shimmin, an analyst at Omdia. “By the way, it looks like they’re just getting started in plugging in generative AI functionality (used right now just for problem documentation), but I would imagine we’ll see community members extend that aggressively across the product in the coming months.”


Tough Road Ahead

Plane has a tough road ahead if its goal is to nibble at Atlassian’s market share with Jira.

“It’s too early to tell [how far the project will go]. It’s at version 0.7, and it has two pricing options: ‘$0’ and ‘coming soon,’” said Jason Bloomberg, an analyst at Intellyx. “I’ve seen several similar tools over the years come and go. It’s difficult for a tool to gain traction when it has to connect to so many things and operate at the center of people’s day to work properly. But then again, everybody hates Jira, so you never know!”

According to some estimates, Jira’s overall market share is 42.32% with more than 93,976 companies using this software.

Yet, “There is always room to disrupt the incumbent,” said Holger Mueller, an analyst at Constellation Research. “Somehow the larger suites always get slower and less innovative, which opens the room for startups and new initiatives. The first thing is they need to do something better, they need to get known — neither is not an issue for Plane. The last is the switching cost, and that is also not too high here. So it will be interesting. If Plane will hurt Jira subscriptions — they will react as well.”

Getting Started with Plane

The easiest way to get started with Plane is by creating a Plane Cloud account. Plane Cloud offers a hosted solution for Plane.

“Currently, Plane Cloud is hosted on Vercel for frontend deployment and on Amazon EC2 for backend services,” Kurama said. “You can self-host your own version of Plane using our Docker images or Docker Compose, which are readily available in our repository.”

The post Open Source Jira Alternative, Plane, Lands appeared first on The New Stack.

How to Build a DevOps Engineer in Just 6 Months https://thenewstack.io/how-to-build-a-devops-engineer-in-just-six-months/ Thu, 01 Jun 2023 17:43:03 +0000 https://thenewstack.io/?p=22709791

In “The Rocky Horror Picture Show,” Dr. Frank-N-Furter sings that he can make a man in just seven days. We’re not that good here at Mission Cloud, but we can make a DevOps engineer in just six months.

We have built an intensive, six-month training program that turns recent graduates and career changers into DevOps engineers. For those in the back of the room, I’ll say it again: We can build DevOps engineers in six months.

I’m not talking about front-line engineers who troubleshoot and follow runbooks: I mean engineers who can build infrastructure. Engineers who can code, who understand containerization, who can wiggle their way into a customer’s environment and work with the team to modernize the heck out of it.

This blog post will dive into why we decided to build this program and the steps we took to make the program successful.

Why a Homegrown DevOps Engineer Program Is Necessary

By some estimates, in the United States, there are only about 6,800 trained DevOps engineers, but over 250,000 active DevOps engineer job openings. Needless to say, it is incredibly difficult to find DevOps engineers, but it shouldn’t be.

Cloud technology has been around for a long time, and the requisite skills to build and change cloud infrastructure aren’t a secret. And yet, it’s hard to find schools with robust cloud engineering programs. A few colleges and universities have built certificate programs, and private, for-profit boot camps have sprung up, but none of them can build what Mission Cloud and other cloud consulting companies need: a builder. A strong DevOps engineer with the breadth of skills needed for modern cloud computing.

Current Schools and Boot Camps Aren’t Cutting It

Part of the problem is there are just so many skills. Cloud computing continues to expand. For example, Amazon Web Services (AWS) adds dozens of new services a year, making it incredibly difficult for any one person to keep up, never mind training programs, which can take years to develop.

When we audited the skills of our engineering teams, we found over 200 skills that each engineer needed to have a handle on. We managed to pare that down to around 150 required skills, but that still left us reeling. As the industry’s appetite for cloud grows, Mission Cloud needs more and more engineers, but how can we find people who have all of those skills?

The Small Talent Pool Isn’t Getting Any Larger

Finding people who can do the work is a challenge. Technical recruiting in cloud engineering is one of the toughest jobs, because there is a limited pool of talented engineers, and it is growing far too slowly for the industry’s needs. Most of the DevOps engineers in the industry learned the trade almost by accident, as their system administration work slowly transitioned into the cloud.

Another problem was that the limited pool of engineers wasn’t very diverse. We couldn’t rely on people falling into cloud as their careers took unexpected turns, or wait for cloud engineering to become more diverse on its own. We needed another solution.

A favorite Buddhist saying goes: “When the only hope is a boat, and there is no boat, I will be the boat.” Mission Cloud was trapped on the same shore as the entire cloud industry: burning out the precious few talented engineers and waiting for somebody else to solve the problem.

We were a tiny startup, weaving our way between the behemoth cloud providers. Was it possible for us to be the boat? Could we create the change we wanted to see in the industry and get to that other shore?

And So, the Journey Began

Our options were limited. If engineers were not arising through spontaneous generation, then we needed to build our own. We started small, literally as small as possible: one employee. Someone in sales who wanted to get into the technical side of the house. Could we turn him into an engineer?

It took almost a year, with about a million missteps along the way, but the answer was, ultimately: yes. We sent him through job rotations in different technical departments and sent him different study plans and certification requirements, most of which were a shot in the dark, but he got there in the end. He became an extremely successful pre-sales solutions architect.

So, there we were, with the seed of a really big idea starting to take root. At that point, we’d proved that it could work, but not that it could work at scale. Maybe we just lucked into a secret genius and this was lightning in a bottle, but maybe it wasn’t. Maybe these skills could be taught, and learned, and applied in the right ways through an engineer training program.


I was, to be terribly transparent, absolutely not the right person to try to build this program. Despite leading the training department, before I joined Mission Cloud I had only the vaguest sense of how the internet worked (Were there cables somewhere? Under the ocean, perhaps?) I was, ahem, of the dial-up generation — my dim understanding of computers began and ended with zeros, ones and the horrible screeching sound of the modem connecting. (I say all this with some shame; my father was a software engineer and tried to incept some of his knowledge into my head but, alas, it never stuck.)

I had one thing going for me though: an absolutely bullish belief that there is no skill in the world that cannot be taught. This belief survived through many conversations with our good-natured engineers, who endured me grilling them on how, exactly, they learned DevOps and cloud engineering skills. “I just Googled it,” was the most common answer, frustrating me to no end. They had learned the skills because they encountered situations they didn’t know how to solve, and they had the desire to solve them. Curiosity and persistence are wonderful personality traits, and certainly made for good engineers, but I could not find the answers I needed.


We knew other companies — big companies with lots of resources — had internship programs to try and teach these skills, but they didn’t seem all that successful. We weren’t seeing the market flooded with hundreds of recent internship graduates, so I didn’t think copying other companies was the way to go.

We needed to build something unique, something so closely embedded into the fabric of Mission Cloud that there would be a seamless transition from the training program into full-time engineering roles.

The Search for the Perfect Teacher

I was stuck on the shore again, so I went in search of a boat. I needed someone to build this program: a talented engineer who no longer wanted to engineer, but wanted to teach, and was a good teacher. Teaching, surprisingly, is one of the most difficult skills out there. I learned this the hard way as an English teacher in Peace Corps China.

As a native English speaker, I knew English, but teaching it — breaking it down into its component parts and passing that knowledge along in a structured way, where lessons built on top of each other — was far beyond what my 21-year-old brain was capable of. Teaching is not a skill most people have, and finding a technical teacher — oh boy.


I looked high, I looked low, I answered questions about the airspeed velocity of an unladen swallow, but at last my quest came to an end when I found Kelby Enevold. This former Army communications soldier had become a skilled AWS cloud engineer and trainer, and I was lucky enough to snap him up as my head of technical training programs.

I’m about to get into the meat of what we built, but all this was to introduce the main point: Building technical training programs is hard! It takes a lot of effort and investment. It is incredible that my small company believed in the vision of what we could build so strongly that they were willing to invest in an entirely new role. This is why we have been successful, though: Mission Cloud truly cares about cloud literacy and is willing to put money behind the idea.

Our Formula for a Successful Training Program

Kelby and I spent months building the learning paths, and then it was time to launch. We brought on several overlapping cohorts of interns and put them through the program. Although we’ve made a lot of tweaks to the training paths and skill development, the basic outline of the program remains the same:

We Pay Our Interns a Good Hourly Rate

Even though it takes them several months to produce work for the company, they are working the entire time. Their effort and time deserve remuneration. The days of unpaid internships are ding-dong-dead!

If you are a small company, you may think you can’t afford this, and it’s true — it might be a real stretch. Try to get creative though: perhaps fewer hours with more intense work, a shorter program or a community partner that can fund interns through government programs (like LA-Tech or America on Tech). Unpaid internships mean limiting opportunities to people with financial means, which totally sucks and works against expanding diversity in the industry.

The First Part of the Program Is Studying Only

This is because we have set the entrance bar somewhat low: knowledge of Linux, AWS and networking. Interns need dedicated time to beef up their skills before we can let them onto real client work. During this time, interns are expected to work through training paths, gain certifications and get the reps in to practice these new skills. Enevold built a meticulous training path, ensuring interns achieve each skill necessary to go on to the next step. He leads them through Linux Essentials first, then dives into AWS Solutions Architect Associate certification studies.


Interns Are Embedded into Departments Doing Real Work for Real Clients

After the study period, interns start the intensive shadowing portion of the program. Although the program manager is their main resource, interns become part of the departments they shadow. They are assigned actual support tickets and have tangible responsibilities.

This piece took the longest for us to set up. Department heads were skeptical — give work to untested interns? Slow down client work? Their reservations made sense, but we kept pushing, and gradually champions in the department started to appear. They saw the incredible eagerness of the interns, how they threw themselves into their work, but more importantly, our interns had the skills.

All that studying, all those gorgeous study paths worked. Our interns were not a drag on anyone’s time. Much more quickly than we had even anticipated, they were able to be a productive part of Mission Cloud.

Interns Have at Least Three People to Lean on

Most programs have interns connected with just one person, and that person is responsible for the development of that intern, in addition to their full-time job. That naturally leads to a lot of dead time for the intern and wasted opportunities.

During shadowing, our interns are still managed by the technical program manager, who checks in with them weekly, if not daily, and helps them understand tasks they are struggling with. Interns also meet weekly with the department manager, who monitors their progress, and a mentor, who assigns them tickets and guides them through the daily work. With this triangle of support, interns always have someone to turn to.

The mentors, mostly senior engineers, were delighted to find that working with interns pushed them to improve. “The greatest take-away for me was clarifying and critically assessing my processes, both technical and organizational, because I’d never had to teach them to someone else before,” said Gabe Norton, senior DevOps engineer.

We Focus on the Mindset and Behaviors of Engineers, Not Just Technical Skill

Engineering isn’t about pure skill. Just because I can chop up an onion and saute some chicken doesn’t mean I can write a cookbook. Our interns gain skills, but we still have to teach them how to apply a troubleshooting methodology. We place a huge emphasis on hands-on environments. Hands-on exercises really complete the learning loop. Learn the thing, apply the thing, probably break the thing, learn more about the thing!

We also teach our interns how to operate within a larger department. We show them when and how to escalate problems they cannot solve, and how to combat imposter syndrome. When they move to full-time roles, it’s pretty scary to suddenly have the training wheels removed. We support them through that transition and show them how to trust the skills they built.

We Build a Thorough DevOps Foundation

We provide interns with an incredible scope of skill development. They start off looking at our more than 150 skills on the skills matrix with a big ol’ gulp of anxiety, but we guide them through the varied topics. They start with Linux and basics like text editing, they learn about AWS while studying for Solutions Architect Associate, they learn about containers, they learn about git and Infrastructure as Code.

Then they learn about CI/CD pipelines and the fundamentals of Python. They even start learning about container orchestration. Each step along the way includes actual hands-on exercises. Each week we have team meetings and talk about the things that we’ve learned. And, along the way we’re also working on skills like communication, escalation and problem-solving to make sure they can be a fully functional team member.

The Result

This program can work for anyone; we’ve had a former chef, a former Marine and recent college graduates all go through the program, crush it and become full-time DevOps engineers. Right now about two-thirds of our interns meet the bar to become full-time employees; we want to get that up to 80%. What we’re incredibly proud of, though, is our contribution to making cloud industries more inclusive: 75% of our interns are racially diverse and 50% are gender diverse.

Investing in Your People is Worth Every Penny

No matter what industry you are in, the skills that we seek so furiously are skills that we can teach. Think about the fees you spend on recruiters, the time managers and staff put into interviewing, the brutal disappointment when a new hire doesn’t work out, and the process begins all over again. All of these things are accepted parts of corporate team growth, but they don’t have to be. We can build our teams from the ground up, with the exact skills needed to succeed in our companies.

This takes time, hard work, and yes, money, always money, but it will pay off. For Mission Cloud, it is paying off right now, as our interns-turned-employees blossom, pull others up behind them and energize our teams anew.

Kelby Enevold contributed to this article.

The post How to Build a DevOps Engineer in Just 6 Months appeared first on The New Stack.

Cloud Security: Don’t Confuse Vendor and Tool Consolidation https://thenewstack.io/cloud-security-dont-confuse-vendor-and-tool-consolidation/ Tue, 30 May 2023 13:24:18 +0000 https://thenewstack.io/?p=22709273

In the current macroeconomic climate, many organizations are looking to consolidate and work with a smaller number of vendors. It’s understandable. Not only are you reducing potential runaway costs and making vendor relationships easier to manage, you can also gain a more advantageous bargaining position on price. The fewer individual vendors a company has to deal with, the easier it is to manage purchasing, get legal clearances, request support and so on.

However, from a security professional’s end-user perspective, vendor consolidation doesn’t necessarily translate to greater efficiency. The reason is simple: Even when you consolidate vendors, you may not consolidate tools. Unless your vendor offers a truly integrated platform, you still end up working with a discrete set of disparate, disconnected solutions. Whether or not they happen to be provided by the same vendor doesn’t matter much.

This is a reality that cloud security teams know all too well today. As business folks push for vendor consolidation, cybersecurity practitioners are left to wonder what vendor consolidation actually means for them, or how it can improve security outcomes.

Let’s take a moment to explore this phenomenon, discuss why vendor consolidation doesn’t always yield the desired results “on the ground” and what to do to ensure that consolidation initiatives result in tangible benefits.

Why the C-Suite Loves Vendor Consolidation

To start, let’s consider why organizations prefer to consolidate cybersecurity tool vendors.

They do it because it streamlines their business processes and leaves fewer vendors to interface with. They get a one-stop shopping process that — just like buying groceries at a supermarket instead of going to individual bakers, butchers, produce stands and so on — will save them time. It might also result in lower overall costs because vendors are more willing to offer pricing discounts when they are selling multiple products to a single customer.

Why Cybersecurity Vendor Consolidation Doesn’t Always Live Up to Its Promise

Unfortunately, simply buying solutions from fewer vendors doesn’t necessarily deliver operational efficiencies or effective security coverage — that depends entirely on the nature of those solutions, how integrated they are and how good a user experience they provide.

If you’re an in-the-trenches application developer or security practitioner, consolidating cybersecurity-tool vendors might not mean much to you. If the vendor that your business chooses doesn’t offer an integrated platform, you’re still left juggling multiple tools.

You are constantly toggling between screens and dealing with the productivity hit that comes with endless context switching. You have to move data manually from one tool to another to aggregate, normalize, reconcile, analyze and archive it. You have to sit down and think about which alerts to prioritize because each tool is generating different alerts, and without tooling integrations, one tool is incapable of telling you how an issue it has surfaced might (or might not) be related to an alert from a different tool.

In short, vendor consolidation without an integrated platform or tight integration between the different tools (which seldom exists) doesn’t make life any easier for cybersecurity practitioners. It might improve business efficiency for procurement but at the same time add overhead and reduce the efficiency of security operations.

A Better Approach to Cloud Security Tooling

Fortunately, it doesn’t have to be this way. It’s possible to consolidate both vendors and tools — a strategy that yields tangible benefits from both a business perspective and a security operations perspective.

In the realm of cybersecurity, and particularly in cloud native security, this approach is possible when businesses choose to work with a vendor that offers a fully unified cloud native application protection platform, or CNAPP. In fact, Gartner expects cloud native security to consolidate from the 10 or more tools/vendors used today to a more viable two to three in a few years.

A true CNAPP will integrate all of the tools that practitioners need to operate efficiently into a single solution. It does away with context switching, and it ensures that teams can draw on all available contextual data when managing alerts and remediation workflows.

At the same time, if you choose a real end-to-end CNAPP developed by a single vendor, it will achieve the business-process consolidation that executives love along with the operational efficiency that practitioners need. The business gets the one-stop cybersecurity shopping it longs for, while practitioners get a solution that addresses all aspects of cloud native application security, across all stages of the application delivery life cycle.

A Holistic Approach to Cybersecurity Vendor Consolidation

The bottom line is this: Consolidation only works when organizations think in terms of vendor consolidation and tool consolidation at the same time. Consolidating vendors alone offers little value if it leaves practitioners struggling to manage discrete, poorly integrated tools, which in turn leaves the business at greater risk of cyberattack because cloud native security teams can’t identify or respond to risks as effectively without a centralized, consolidated platform. Vendor consolidation on its own might deliver some cost benefits and easier vendor management, but those efficiencies can be canceled out or even outweighed by a poor user experience; a lack of consolidated policies, processes and outcomes; and higher overall operational overhead.

The good news is that CNAPP solves this dilemma. A CNAPP platform worth its name delivers all-in-one protection that keeps business folks happy while also helping to maximize the operational efficiency of cybersecurity teams.

Contact us to learn more about how Aqua’s CNAPP platform helps organizations optimize business efficiency and cybersecurity readiness at the same time.

The post Cloud Security: Don’t Confuse Vendor and Tool Consolidation appeared first on The New Stack.

Take a Platform Engineering Deep Dive at PlatformCon 2023 https://thenewstack.io/take-a-platform-engineering-deep-dive-at-platformcon-2023/ Fri, 26 May 2023 17:00:45 +0000 https://thenewstack.io/?p=22709217

The highly anticipated PlatformCon 2023 is fast approaching, accompanied by a colossal amount of industry hype, and it’s easy to see why. The two-day virtual conference, which will be held June 8-9, celebrates the more than 15,000-member platform engineering community and features a large lineup of renowned industry speakers. Thousands of platform engineers and practitioners from around the world will participate, welcomed by a packed schedule and the opportunity to dive deep into the latest platform engineering trends, solutions and best practices.

Reasons to Attend PlatformCon 2023

Attendees joining PlatformCon 2023 can expect to level up their platform engineering skills by networking with experts and joining a vibrant community of platform engineers, all dedicated to pushing boundaries. Participants can enjoy regional kickoff events, watch talks at their own pace and engage in speaker Q&A sessions over on the Platform Engineering Slack channel.

The full conference schedule is available here. Attendees will get the chance to:

  • Engage with renowned industry speakers like Nicki Watt, CEO/CTO of OpenCredo; Bryan Finster, value stream architect at Defense Unicorns; Stephan Schneider, digital expert associate partner at McKinsey; Charity Majors, CTO at Honeycomb; and Manuel Pais of Team Topologies.
  • Meet professionals from all over the globe who share similar interests.
  • Be inspired by new insights and fresh ideas for platform engineering initiatives.
  • Explore multiple tracks and listen to top field experts tell their stories.

Hundreds of Captivating Talks Spanning Five Tracks

Over the two days, PlatformCon 2023 will feature a diverse range of compelling talks covering five tracks:

Stories: Practitioners will share their enterprise platform-building experiences, covering the journey from inception to implementation and rollout. Examples include Adobe’s Rohan Kapoor discussing the development of an Adobe internal developer platform for over 5,000 developers, addressing challenges, productivity measurement and learnings from a recently launched CI/CD product.

Tech: This track will delve into the technical aspects of developer platforms. Expect talks from speakers such as Humanitec’s Susa Tünker, who will discuss eliminating configuration drift between environments, plus sessions on problem-solving with tools such as Kubernetes, Infrastructure as Code, service catalogs and GitOps.

Blueprints: Speakers in this track will present proven platform blueprints, including McKinsey’s Mike Gatto and Stephan Schneider, who will explore simplifying developer platform design through reference architectures. Attendees can expect other talks to highlight key design considerations and effective integration of developer platform tools.

Culture: Focusing on building developer platforms by engineers for engineers, this track will examine the cultural aspects of platform engineering. Among the topics that will be discussed are product management and the relationship between platform engineering, DevOps, and site reliability engineering. Nicki Watt from OpenCredo will address stumbling blocks hindering the creation of a great platform as a product, and offer counteractive solutions.

Impact: This track will explore the business value and impact of platform engineering initiatives. Analysts like Gartner’s Manjunath Bhat will provide value stories to demonstrate how platform engineering accelerates business outcomes, while other practitioners will discuss strategies for securing executive buy-in.

For anyone looking to level up their platform engineering skills, this is a great opportunity to learn from, network with and be inspired by the best in the industry. Register now for new speaker updates, chances to get involved and details about in-person and virtual meetups.

The post Take a Platform Engineering Deep Dive at PlatformCon 2023 appeared first on The New Stack.

Developer Guide: A New Way to Build on the Slack Platform https://thenewstack.io/developer-guide-a-new-way-to-build-on-the-slack-platform/ Fri, 26 May 2023 15:30:52 +0000 https://thenewstack.io/?p=22709222

Today more than a million developers use the Slack platform each week to build custom applications. It’s amazing to witness this level of creativity, such as building reminder bots and daily automations; integrating processes and support avenues; and building fun, inclusive tools.

But that’s not the end of the story. With the release of our new platform capabilities comes a variety of new possibilities.

What the Changes Unlock

The Slack platform now supports you in building better apps, faster. The changes allow you to move away from creating apps that predict common user behaviors (but aren’t always easy to tweak or adapt) toward building modular app pieces so you can solve user problems in a more efficient way.

Before, the system allowed apps to be created in a product-centric “closed box,” providing your end users with complete, end-to-end experiences.

With the new platform capabilities, it’s possible to create more modular, use-case-centric building blocks for Slack users to benefit from. Think of it like this: The improvements allow you to create your own pantry of reusable ingredients, consisting of functions and triggers, so you can easily make the perfect dish that suits your user needs. Along with a new command line interface (CLI) and Slack hosting, this makes the Slack platform a much better open system.

Streamlining App Creation

The Slack platform offers a lot of powerful APIs and features, and some really sophisticated apps have been built using them alongside our first-party software development kit (SDK). Until recently, however, there were still hurdles for developers wanting to fully leverage the platform:

  1. There wasn’t a streamlined, simple way to create an app.
  2. Once an app was created, there wasn’t a lot in the way of scaffolding.
  3. Every app had to be built from scratch each time, repeating project setup boilerplate.
  4. Developers encountered friction points related to hosting, data storage and OAuth.

The new capabilities address these points. They include:

  • Overall user and developer experience improvements
  • Better scaffolding
  • A Slack CLI
  • Improved functions, workflows and triggers
  • The ability to run your app on Slack

Watch this overview of the new features from developer advocate Jeremiah Peoples.

“[The improvements] now allow apps to expose their functionality as shared functions that can be combined into workflows (by the developer or user). This means apps no longer have to solve everyone’s unique problems and instead the Slack Platform allows users to solve their own problems unique to their business.”

What’s New

Let’s dive into how the Slack platform allows you to build better, faster.

Overall UX and DX Improvements

The new capabilities offer a more flexible way to build on Slack in no-code, low-code and pro-code ways. This enables better self-service automation and integration, unlocking more advanced use cases for all team members.

Additionally, teams now need fewer IT and security approvals because everything can be hosted on Slack, which is designed to be compliant and reliable by default.

Better Scaffolding

Slack now provides better scaffolding for building apps, setting you up for more success, faster.

For years, developers have requested an opinionated structure and approach to building apps. For instance, until now, education around certain topics has been complicated for newer builders. This is especially true for permissions (user vs. bot actors and how they differ, what each is allowed to do) and tokens (how each one uses its authentication token, how scopes map to permissions/actions within Slack and how those are associated to each auth token).

The new additions to the platform remove potentially confusing considerations around tokens — apps now only need to specify the scopes required to do what they do.

So while the core APIs haven’t changed (our Web API and Events API still power the platform), and our granular permission-app model hasn’t changed (your app will still have a granularly scoped permission in order to run), we’ve shifted the mindset of the platform slightly and improved the developer experience, which in turn allows better apps to be built.

A Slack CLI

Have you made a Slack app before? If so, you’ll recall that it meant a lot of clicks in the graphical user interface on the Slack dev website, as there was a lot of configuration to set up. With the new Slack CLI, creating a Slack app is now as easy as typing slack create in your terminal.

The CLI runs on macOS, Linux and Windows and manages the configuration, creating a new app ID, managing your event subscriptions and scaffolding the code for you. The CLI also allows you to run your apps locally, test in a variety of environments and even deploy code, right from your terminal.

This means you can concentrate on automation without having to copy configuration parameters between systems. The CLI and app manifests can also be integrated into your broader software development life cycle, such as through continuous integration pipelines or other automations.
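
For instance, a first session might look something like the sketch below. The subcommand names follow the behavior described above (create, run locally, deploy); check the CLI’s built-in help for the exact, current syntax.

```
slack create my-welcome-app   # scaffold a new app from a template
cd my-welcome-app
slack run                     # run the app locally against a dev workspace
slack deploy                  # deploy the app to Slack's infrastructure
```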

Functions, Workflows and Triggers

These are the three core building blocks that contribute to our modular architecture and help support flexibility and reusability:

  • Functions: Modular, reusable blocks of functionality, with a handful of built-in functions you can use like CreateChannel.
  • Workflows: A combination of functions makes up a workflow. For instance, a workflow might welcome new members and provide them with relevant info when they join a channel.
  • Triggers: Workflows are invoked by triggers, of which there are a few types: event, scheduled, webhook and link.

In some ways, functions, workflows and triggers were already present as steps from apps in the existing platform. However, we recognized that there were barriers to discovery and use, so we took steps to improve them. For instance, workflows now work across multiple channels and exist at a global level, so they can be used across channels and other surfaces, including an app home screen and canvases. Previously, workflows only worked in a single channel, so they had to be remade if they were to be reused.
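
To make these three building blocks concrete, here is a minimal sketch using the Deno Slack SDK (introduced later in this post). The callback IDs, titles and greeting logic are invented for illustration, and the exact API surface may vary between SDK versions.

```typescript
import {
  DefineFunction,
  DefineWorkflow,
  Schema,
  SlackFunction,
} from "deno-slack-sdk/mod.ts";

// Function: a modular, reusable block of functionality.
export const GreetFunction = DefineFunction({
  callback_id: "greet_function", // hypothetical ID
  title: "Greet a new member",
  source_file: "functions/greet.ts",
  input_parameters: {
    properties: { user: { type: Schema.slack.types.user_id } },
    required: ["user"],
  },
  output_parameters: {
    properties: { greeting: { type: Schema.types.string } },
    required: ["greeting"],
  },
});

// The function's implementation, which runs on Slack's infrastructure.
// (In a real project this lives in the function's own source file;
// everything is shown together here for brevity.)
export default SlackFunction(GreetFunction, ({ inputs }) => ({
  outputs: { greeting: `Welcome aboard, <@${inputs.user}>!` },
}));

// Workflow: a combination of functions.
export const GreetWorkflow = DefineWorkflow({
  callback_id: "greet_workflow",
  title: "Welcome new members",
  input_parameters: {
    properties: { user: { type: Schema.slack.types.user_id } },
    required: ["user"],
  },
});
GreetWorkflow.addStep(GreetFunction, { user: GreetWorkflow.inputs.user });

// Trigger: invokes the workflow. A link ("shortcut") trigger is defined in
// its own file and references the workflow, roughly:
//   { type: "shortcut", name: "Greet someone",
//     workflow: "#/workflows/greet_workflow", ... }
// See Slack's trigger docs for the exact typings of event, scheduled and
// webhook triggers.
```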

Run Apps Completely on Slack, Securely

With the changes to Slack’s platform offering, you can now run your apps completely on Slack. This means you can do more, in less time, and focus on the app functionality itself.

By default, code deployed to Slack will automatically fulfill all the industry security and compliance standards that are built into Slack’s core product — from SOC 2 to FedRAMP, and more.

We’ve also added the Deno Slack SDK to the platform. Deno is a new runtime environment that offers secure-by-default architecture, modern web standards and both JavaScript and TypeScript support, which means built-in autocompletion and handy hints in your code editor.

“What sets the platform apart is the sheer amount of options and support it offers. The development experience was fluid and expansive, helping us explore new ways to integrate our support structure directly in Slack.”

Tyler Beckett, SaaS operations engineer, Workiva

In addition, Deno’s secure architecture gives developers built-in granular permissions and controls, like the ability to limit access to certain external domains, network hosts, file system directories and environment variables. Especially for enterprise customers, this is a big win for both developers and admins. These guardrails mean the ability to easily build enterprise-grade apps that Slack admins trust.

Another addition to the platform is datastores. Until now, developers needed a third-party tool (such as SQLite or an ORM) to store data. Datastores, in comparison, are an easy way to store and retrieve data: The platform now includes a new set of datastore APIs for managing application data stored on Slack infrastructure. This makes it easier than ever for you to create, read, update and delete your app data without using third-party tools.
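
As a rough sketch (the datastore name and attributes below are invented for illustration, and the API may differ between SDK versions), defining a Slack-hosted datastore and writing to it looks something like this:

```typescript
import { DefineDatastore, Schema } from "deno-slack-sdk/mod.ts";

// Declare a datastore backed by Slack-hosted infrastructure.
export const WelcomeDatastore = DefineDatastore({
  name: "welcome_messages", // hypothetical name
  primary_key: "id",
  attributes: {
    id: { type: Schema.types.string },
    channel: { type: Schema.slack.types.channel_id },
    message: { type: Schema.types.string },
  },
});

// Inside a function handler, the provided API client exposes CRUD-style
// datastore calls, for example:
//   await client.apps.datastore.put({
//     datastore: "welcome_messages",
//     item: { id: crypto.randomUUID(), channel, message },
//   });
```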

App Manifest

Previously, app management was done only through a Slack web interface, and it took a lot of clicks to manage and change the particulars of your app. Now, you can programmatically manage your app through the app manifest file.

The manifest.ts file is the most important file in your app’s root directory. Especially in apps that follow a modular architecture, it’s the core of the app’s nervous system, coordinating all the moving pieces. Having it accessible as a file means you can understand, change and share app templates quickly, and the resulting apps can run and be installed on any workspace.

Configure your app’s name, define functions and workflows, set bot scopes and so much more, all from the comfort of your favorite code editor (and with the Slack CLI).
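
A minimal manifest.ts might look like the sketch below. The app name, icon path, import paths and scopes are illustrative rather than taken from a real project.

```typescript
import { Manifest } from "deno-slack-sdk/mod.ts";
// Hypothetical project files from the earlier sketches:
import { GreetWorkflow } from "./workflows/greet_workflow.ts";
import { WelcomeDatastore } from "./datastores/welcome_messages.ts";

export default Manifest({
  name: "welcome-bot",
  description: "Greets new members and records welcome messages",
  icon: "assets/icon.png",
  workflows: [GreetWorkflow],
  datastores: [WelcomeDatastore],
  // Scopes map directly to what the app is allowed to do.
  botScopes: ["chat:write", "datastore:read", "datastore:write"],
});
```

Because the manifest is just code, it can be version-controlled and reviewed in a pull request like anything else in the project.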

Coming Soon

Workflow Builder (WFB) 2.0: With Workflow Builder, people can automate work for themselves and their teams without writing a single line of code. Since Workflow Builder launched in 2019, more than 540 million workflows have been created, with users launching 1.7 million workflows every business day.

Later this year, we’ll release a new in-Slack-client Workflow Builder UI. This will expose all the invaluable, existing Workflow Builder constructs — including functions, triggers and workflows — in a friendly point-and-click interface. This will improve the UX for motivated folks wanting to automate processes in Slack, supporting the next million users in building workflows that fit their individual use cases.

Some might already have experience with platform-as-a-service tools like Salesforce and be familiar with putting integrations together in just a few clicks. The new capabilities of the Slack platform mean they can easily transfer these skills to Slack. Do you have a colleague who learned how to add custom Slack emojis not long ago? Well, the new capabilities on Slack not only add things like a CLI and SDK for seasoned developers, they also make the entire user experience more accessible. This makes it easier for everyone to orchestrate work in Slack without needing to code.

There’s value for everyone (even developers) in being able to build workflows quickly, without code. Keep an eye out for these newer builders, as they could be creating custom Slack workflows in no time.

How to Get Started

Getting started on the platform, and taking advantage of all the new functionality, couldn’t be easier: Check out our quick-start guide. 

The platform is especially helpful for scenarios where you can’t easily host on your own, or if the wider infrastructure is challenging to figure out (for instance, if you’re working in a large enterprise company with restricted hosted compute resources, challenging processes or bureaucracy barriers).

Tap into New Capabilities with an Existing App

If you have an existing app on the platform, you don’t necessarily need to upgrade or change anything; it will still work. In the future, you could consider rebuilding with the new functionalities available to you. However, bear in mind that apps built on the new platform can only be used in one organization at a time, so you cannot scale your app across organizations if that’s your intention.

You can tap into the new capabilities from existing apps today using:

  1. Functions: Add functions to your app.
  2. Metadata: Start sending metadata.

We’re Excited to See What You Build Next

To start building with Slack’s new platform capabilities, begin with the quick-start guide above.

If you have any questions, head over to the Slack Community Forum.

We look forward to seeing what you build!

The post Developer Guide: A New Way to Build on the Slack Platform appeared first on The New Stack.

Better Security with ChatGPT: Using AI’s Defensive Strengths https://thenewstack.io/better-security-with-chatgpt-using-ais-defensive-strengths/ Fri, 26 May 2023 13:00:45 +0000 https://thenewstack.io/?p=22709243

While ChatGPT has grabbed negative headlines recently due to cybercriminals’ use of the technology to strengthen attacks, it can also be a formidable asset for cyber defense, helping companies maximize their security posture while promising to bridge any skills gaps in their workforce.

That’s particularly relevant as security teams become increasingly overwhelmed by an ever-expanding threat landscape — according to the results of a recent Cobalt survey, 79% of cybersecurity professionals say they’re having to deprioritize key projects just to stay on top of their workload.

Mike Fraser, vice president and field CTO of DevSecOps at Sophos, told The New Stack that generative AI has an enormous amount to offer to those overloaded security teams. “ChatGPT can be utilized for threat intelligence analysis, incident response guidance, security documentation and training generation, vulnerability management, security policy compliance, and automation,” he said. “With automation alone, the cybersecurity use cases are endless.”

The Cloud Security Alliance (CSA) recently published a white paper examining ChatGPT’s offensive and defensive potential in detail. CSA technical research director Sean Heide, one of the paper’s authors, said one key strength of the tool is that it allows users to simply ask in natural language for a specific attribute they need written for a task, or to make tasks more efficient with new suggestions.

“These tasks would typically take teams, depending on experience, a few hours to properly research, write out, test, and then push into a production scenario,” Heide said. “We are now seeing these same scripts being able to be accurately produced within seconds, and working the same, if not better.”

And Ric Smith, chief product and technology officer at SentinelOne, said it’s important to keep in mind that ChatGPT itself isn’t the only way to make use of large language models — dedicated solutions like SentinelOne’s recently announced AI-based threat-hunting platform can do it in a more focused way. “Companies need to think of LLMs as expert services and maintain a level of pragmatism in how and where they leverage generative AI,” he said. “You can create a fantastic generalist like GPT-4. But in reality, having a complex model is optional if the task is more focused.”

Bridging the Skills Gap

Chang Kawaguchi, vice president and AI security architect at Microsoft, said generative AI tools like his company’s Security Copilot can serve both to assist highly skilled employees and to fill in knowledge gaps for less-skilled workers. With Cybersecurity Ventures reporting a total of 3.5 million cybersecurity job vacancies worldwide (and expecting that number to remain unchanged until at least 2025), there’s a real need for that kind of support.

“We’re definitely hoping to make already skilled defenders more effective, more efficient — but also, because this technology can provide natural-language interfaces for complex tools, what we are starting to see is that lower-skilled folks become more effective in larger percentages,” Kawaguchi said.

At every level, Smith said, ChatGPT can simply make the work more approachable. “By enabling analysts to pose questions in their natural form, you are reducing the learning curve and making security operations more accessible to a larger pool of talent,” he said. “You are also making it easier to move more rudimentary operations to junior analysts, freeing veteran analysts to take on more thought work and sophisticated tasks.”

That’s equally true for the summarization and interpretation of data. “When you run hunting queries, you need to be able to interpret the results meaningfully to understand if there is an important finding and the resulting action that needs to be taken,” Smith said. “Generative AI is exceptionally good at both of these tasks and reduces, not eliminates, the burden of analysis for operators.”

It’s not that different, Smith said, from what spell check has done in freeing writers to focus on content rather than on proofreading. “We are lowering the cognitive burden to allow humans to do what they do best: creative thinking and reasoning,” he said.

Still, it’s not just about supporting less-skilled users. Different levels of generative AI capability, Kawaguchi said, are better suited for different levels of user expertise. At a higher level, he said, consider the potential of a tool like GitHub Copilot. “It can provide really complex code examples, and if you’re a highly skilled developer, you can clearly understand those samples and make them fit — make sure that they’re good with your own code,” he said. “So there’s a spectrum of capabilities that generative AI offers, some of which will be more useful to lower-skilled folks and some of which will be more useful to higher-skilled folks.”

Handling Hallucinations

As companies increasingly leverage these types of tools, it’s reasonable to be concerned that errors or AI hallucinations will cause confusion — as an example, Microsoft’s short video demo of Security Copilot shows the solution referring confidently to the non-existent Windows 9. In general, Kawaguchi said Security Copilot strives to avoid hallucinations by grounding it in an organization’s data or in information from trusted sources like the National Institute of Standards and Technology (NIST). “With grounding the data, we think that there’s a significant opportunity to, if not completely eliminate, greatly reduce the hallucination risk,” he said.

Basic checks and balances, Heide said, are also key to mitigating the potential impact of any hallucinations. “Much like there are review processes for development, the same will need to be taken around the usage of answers received from ChatGPT or other language models,” he said. “I foresee teams needing to check for accuracy of prompts being given, and the type of answers being provided.”

Still, Fraser said one of the key remaining barriers to adoption for a lot of companies lies in concerns about accuracy. “Thorough testing, validation and ongoing monitoring are necessary to build confidence in their effectiveness and minimize risks of false positives, false negatives or biased outputs,” he said.

It’s similar, Fraser said, to the benefits and challenges of automation, where ongoing tuning and management are key. “Human oversight is necessary to validate AI outputs, make critical judgments and respond effectively to evolving threats,” he said. “Security professionals can also provide critical thinking, contextual understanding and domain expertise to assess the accuracy and reliability of AI-generated information, which is essential to a successful strategy using ChatGPT and similar tools.”

Understanding the Benefits

While many companies at this point are more concerned about the threat from ChatGPT than they are invested in its potential as a defensive tool, Heide said that will inevitably shift as more and more users understand its potential. “I think as time goes on, and teams can see how quickly simple scripts can be completed to match an internal use case in a fraction of the time, they will begin to build more pipelines around its usage,” Heide said.

And as we move forward, Kawaguchi said, there’s an inevitable balancing act to be found between proceeding carefully in adopting generative AI and staying ahead of adversaries who may be surging forward with it. “It does feel relatively analogous to other step changes in technology that we’ve seen, where both offense and defense move forward and it’s a race to learn about new technology,” he said. “Our goal is to do so responsibly, so we’re taking it at an appropriate speed — but also not letting offense get ahead of us, not letting the malicious use of these technologies outpace just because we’re worried about potential misuse.”

Ultimately, Fraser said ChatGPT’s future as an asset for cyber defense will depend on responsible development, deployment, and regulation. “With responsible usage, ongoing advancements in AI and a collaborative approach between human experts and AI tools, ChatGPT can be a net benefit for cybersecurity,” he said. “It has the potential to significantly enhance defensive capabilities, support security teams in their fight against emerging threats, solve the skills gap through smarter automation, and enable a more proactive and effective approach to cyber defense.”

The post Better Security with ChatGPT: Using AI’s Defensive Strengths appeared first on The New Stack.

Is Open Source the Original Product-Led Growth? https://thenewstack.io/is-open-source-the-original-product-led-growth/ Thu, 25 May 2023 14:32:59 +0000 https://thenewstack.io/?p=22709030

Take a journey with me back to December 2022. I’m in job-hunting mode, and in interviews the term “PLG” comes up. I haven’t heard the term before, so after a quick Google search I learn that “PLG” stands for product-led growth, and it’s been around awhile. I read some articles and the more I learn about PLG, the more I realize that this is the open source software use model I’ve been working with for nearly a decade. Wow, I didn’t know it had a name!

To give you a little history: According to a blog post at OpenView Partners, the term “product-led growth” was coined in 2016 by Blake Bartlett at the venture capital firm, “although the principles that define it had been around before that.” It started between 2012 and 2014, when Bartlett saw that, when promoting products, product-market fit was only part of the battle; companies need to be obsessed with product distribution too.

“Great companies pay close attention to how to remove friction and turn their product into a marketing asset,” the blog post states.

Pivot Back to Open Source: Part 1 — Developers

I’ll be clear: Open source software was not started to support a PLG model. Open source software has a set of benefits that we all know and live by.

What I am saying is that in the organizations I’ve worked at, bringing in users at the open source software level is a great first step in giving users hands-on experience with the technology.

Let me give you another reference point. Stephen O’Grady’s book, “The New Kingmakers: How Developers Conquered the World,” “explores the rise of the developer class, its implications and provides suggestions for navigating the new developer-centric landscape.” (If you haven’t read it, you should!) To summarize: Developers are the most important asset organizations have. With the availability of open source and free versions of software, they go out and find the tools they need. They don’t ask; they just download and start using them. Or they take the technology they find, build upon it, contribute their enhancements (or not) to the open source project and use it to make their day-to-day tasks better.

Then organizations figure out they need to give these highly talented developers, DevOps teams and operators (collectively “practitioners”) the freedom to get what they need to do their job. If they don’t, these talented individuals will go to an organization that will let them do their job, and your organization will be stuck with the super-hard problem of replacing that talent.

So the practitioners in an organization are the people having a huge influence on the technologies an organization is using.

Pivot Back to Open Source: Part 2 — Technology Decisions

We know that with open source software, anyone can download and use it. They can fork it, contribute to it, deploy it in their environment. Use the software, ingrain it in their environment, and when it comes time for the CIO to say, “Hey we need something that does <this>,” the practitioner tells them they have <this thing> they’ve been using, it solves <that> problem, and they love it. The CIO says great, and the technology is blessed.

Then, maybe the CIO says, “Maybe we should get support with this so you can focus on your job and not have to keep this thing up to date in our infrastructure.” The practitioner says great but may be a little sad because they don’t get to use and contribute to open source; however, they also realize that there are other open source projects that are pretty nifty. The CIO calls the 1-800 sales number or fills out a form, and someone in sales does the paperwork to sell it to the organization.

How Is Open Source Like Product-Led Growth?

This simplified example is product-led growth in a nutshell. The practitioner looks for a tool that will do something in a better way than the roundabout way they had been doing it. They find an open source project that works for what they need, and they become a fan of the technology. In PLG, the practitioner finds the free version of the software and … well, you get the picture.

Which brings me back to PLG. I was brought into my current organization because I know open source, I have worked with developer communities, and I know how to build a community of users. What I think about is education, creating content the practitioner audience cares about, giving individuals a sandbox to play in, having a clear and user-focused journey for getting started with the technology, and then figuring out where these individuals are getting their information and being in those places. Marketing 101.

In Summary

The practitioner audience — developers, operators, DevOps, DevSecOps and contributors — are the people we should focus on. They are the people for whom we are developing technology: to make their cloud native environments easier to use, more secure and faster to get information out of, among other things. When we let go of control and put the technology decision in the hands of the users, people will choose to use something because it’s good.

Learn more about Cisco Open Source and join our Slack community to be part of the conversation!

The post Is Open Source the Original Product-Led Growth? appeared first on The New Stack.

How to Start a Software Project: A Guide for Junior Devs https://thenewstack.io/how-to-start-a-software-project-a-guide-for-junior-devs/ Sat, 20 May 2023 14:00:35 +0000 https://thenewstack.io/?p=22708121

“OK, let’s start coding!” However exciting these words are, they are far more comforting when it won’t be you who has to do all the work to kick everything off.

Consequently, starting a software project is a real divider between the experienced senior and the eager junior — and so I recommend that tyro devs get very familiar with all the areas that need to be covered, then have a go with a project that doesn’t have too many eyes on it. Many decisions can be delayed, and certain things can be trivially changed without any side effects, but some items are more expensive to alter later. This post is about where to start, and what bits are best to get right early on.

What Good Looks Like

The number one killer of all projects — even those that are not scrutinized in any way — is that their worth cannot be measured. Even little habits you start for yourself, like going to the gym or starting a diet, get quietly dropped if you see no measurable progress. In industry, unmeasurable projects may look good, but they have an inbuilt kill switch because you cannot prove that they add any value. Remember all those slightly annoying surveys you got in your inbox asking you questions about a website or service you just used? These are people making a solid effort to measure intangibles like “knowledge share.” However hokey it is, try to build a measurement track into your project from the start. There are various things to measure, from early adoption to unique page views. Conversely, you can measure the decline in a bad thing your project is trying to prevent.

Keeping an up-to-date project wiki is the key to stopping early problems from spreading because of unclear aims. Write down what the project should achieve, the basic components you think are needed, who the stakeholders are and, yes, a few target dates. Novel situations will always occur, and people will do the wrong things — and frankly, some of the decisions you make early on will be faulty. None of these cause chaos. That is caused when people don’t share a strong enough idea of a project’s direction with anyone else. Just writing enough down is quite easy to do, and stems a lot of doubt.

The MVP and the Fakes

The first thing you produce or build should just be the shape of what you need, and little else. If the product is a website, then it should match the format you want, but be filled with “Lorem ipsum” placeholder text. What matters is that the MVP (minimum viable product) forces you to focus on the infrastructure you need to get in place.

Do you need access to an expensive service? Are you charging money? Is your service time-dependent in some way? In almost all cases you will need to fake one or more components of your project in order to test it, without invoking real-world costs or conditions.

Sometimes quite a bit of work is needed to create the fake environment, but that must be done. For example, while it is much easier to charge money than it used to be, we’ve all seen services that try to introduce charging only to discover that it isn’t so easy to plug in later (because of all the previous assumptions).

Services like AWS Lambda are very good for building cheap fakes, as they only charge when triggered. Fake data also needs consideration: Testing on data that doesn’t match your product’s actual customer use will inevitably make for bad outcomes. A case in point was an institution that used obfuscated live data for testing. But the data was so heavily disguised that it destroyed the natural relationships between customers (for example, that real people live together), and so it caused problems later.
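
To make the first idea concrete, here is a minimal sketch of a fake payment endpoint written as an AWS Lambda handler in TypeScript. The field names and the decline threshold are invented for illustration, not taken from any real provider’s API.

```typescript
// fake-payments.ts: a hypothetical Lambda handler that stands in for a real
// payment provider in test environments. It never charges anyone; it just
// returns a response in the shape the application expects.
import type { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";

export const handler = async (
  event: APIGatewayProxyEvent,
): Promise<APIGatewayProxyResult> => {
  const { amount = 0, currency = "USD" } = JSON.parse(event.body ?? "{}");

  // Deterministic, canned behavior: amounts over a threshold are "declined"
  // so that failure paths get exercised too.
  const approved = amount <= 10_000;

  return {
    statusCode: 200,
    body: JSON.stringify({
      status: approved ? "charged" : "declined",
      amount,
      currency,
      transactionId: "fake-txn-0001", // stable fake ID for assertions
    }),
  };
};
```

Because Lambda only bills per invocation, a fake like this costs close to nothing to leave running for the life of the project.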

Identities and Who Does What

One of those “hard to alter later on” mistakes is forgetting to create email addresses, domain entries and accounts for your project, and instead setting these up under your own account details because you wanted to save time. Don’t do this. It doesn’t matter if you use a domain name or email address that doesn’t match the final identity — it matters that these are not connected to you or anyone else. Otherwise, the whole project goes on holiday when you do.

If you are fortunate enough to have help, then you need to split up the work into sensible portions. Fortunately, the agile methodology works very well for developers on starting projects — because at the beginning you have nothing but a list of tasks to be achieved. People can only take on the tasks if they understand them, which forces you to define them clearly. The same is true if you plan to use AI help — record whatever prompts you use. To start with, this is all you need. The agile mantra is:

Make it work, make it right, make it fast.

So start by making it work with whoever is onboard.

Environments and QA

If you start by understanding what to measure, and where to fake, you will probably find testing and Quality Assurance (QA) follow on naturally. You can use Jira or Trello to communicate with your testers, but whatever you choose should mesh with the tools you use to split up your stories and tasks. The world of containers has massively improved the chances that any environment you build in is pretty darn close to the environment your testers are using.

If you are behind a firewall, now is the time to make good friends with the security team. Otherwise, you will quickly find that you cannot share anything with any offshored testers.

When I say environment, I mean staging, QA and production. If you remove these terms for a moment, we are generally just talking about virtual computing spaces with different configurations. So for example, the QA environment allows your testers to play with the latest stable build and is configured to work with fake services. Scripting to create your environment will involve some type of playbook — make sure you have the skills available to do that.

Developer Tooling

How to actually write the code comes much lower down the priority list than you may have imagined, because it is much easier to set up and change. You can’t blame software developers for wanting to focus on frameworks, coding standards and editors — as that is our stock in trade. Most initial decisions can be altered later. In fact, rewriting your codebase should be something you aim to do at some point; it isn’t something to avoid altogether. But, like going to the toilet, just don’t wait until you have to.

The bigger IDEs tend to include dummy projects and lots of services that can help everyone start. Yes, they want to tie you into their other services, but their utility may be the difference between starting or not. The trick with using any highly integrated services from third-party companies is to make sure you have defined your architecture before you start, so that Microsoft (or whoever) doesn’t redefine your project to suit its tooling. Physical dependency is simple to change, mental dependency is a bit harder to shift.

If you are programming in the open, you will want to use git with GitHub for your central code repository. But in most cases, you will want to run private repositories with one of the many central repository services. If you know you will produce lots of slow-changing artifacts, then you may need an artifact repository (or DockerHub), and if you are dealing with lots of large or non-text files (such as large images), then you may need to avoid git altogether and use something like PlasticSCM (which is now within Unity).

Setting up CI/CD

An example CI/CD pipeline; via dev.to

(Unless you are writing Go, don’t expect to see any blue gophers near your screen)

The center of your project will always be the build pipeline — the heart of Continuous Integration/Continuous Deployment (CI/CD). Simply put, you need to create a build from the appropriate code branch of your product or service and deploy it to one or more environments from a single signal. In other words, automate the deployment. This isn’t something you need immediately, but don’t do anything early on to prevent automation later.

Teams still use the open source favorite Jenkins to check out, build and deploy, but there are many other options. If you keep an eye on maintaining the configuration files that work for you, then changing the pipeline shouldn’t be too painful.

Once basic build automation is in place, you can slot in other services — like code coverage and security testing.

Conclusion

So you’ve defined your project, worked out what good looks like, described what you think the components and processes should be, figured out the infrastructure, got the roles sorted out, checked in the first MVP and cranked the handle on the pipeline.

The important thing about projects is not how they start (no one will remember if all goes well), but how well they are maintained through their lifecycle. And yes, you also need to know when and how to retire them.

The post How to Start a Software Project: A Guide for Junior Devs appeared first on The New Stack.

AI Improves Developer Workflow, Says Gradle Dev Evangelist https://thenewstack.io/ai-improves-developer-workflow-says-gradle-dev-evangelist/ Fri, 19 May 2023 17:00:57 +0000 https://thenewstack.io/?p=22708592

Developer tools are scrambling to integrate AI into their products, no matter which part of the developer workflow they cater to. One example is Gradle Build Tool, an open source build automation tool that has been around for fifteen years now. The company behind it, Gradle Inc., has been paying particular attention to AI, since it will fundamentally change the concept the company coined: Developer Productivity Engineering (DPE).

I spoke with Trisha Gee, the lead developer evangelist at Gradle, about how AI is impacting the developer workflow. Prior to joining Gradle at the beginning of this year, Gee had over two decades of experience as a developer — mostly focusing on Java.

AI Is Additive for Devs

Gee says that her view on AI for developers has evolved rapidly. Similar to other senior devs I know, she initially dismissed AI’s significance. However, she has since recognized it as a valuable tool for developers.

She now thinks of AI as an addition to the developer’s toolkit, rather than a replacement for developers. Gee compares the evolution of AI tools to the advent of internet search engines like Google back in the 1990s, which quickly became indispensable for developers when troubleshooting problems. Just as using Google and Stack Overflow has made coding more efficient, she thinks leveraging AI tools to generate code and answer specific questions will do the same.

Gee emphasized, though, that developers must still rely on their own expertise and experience to filter AI-generated code and apply it appropriately within their codebase. She believes that AI can accelerate development by reducing the time spent on repetitive tasks — like writing boilerplate code — and enabling developers to focus on the bigger picture, such as ensuring the code meets business requirements.

How ML is Used in Testing

As well as AI code generation, machine learning is used in products like Gradle Enterprise, which aims to save developers’ time by automating time-consuming tasks and eliminating wasteful activities.

For instance, Gradle Enterprise offers features like “predictive test selection,” which uses machine learning to run tests impacted by code changes, instead of running the entire test suite. This approach improves efficiency by focusing on relevant areas, Gee said.

I asked whether there was a big impact on tools like Gradle from the potential errors output by code generation tools like GitHub Copilot.

She replied that, yes, having tools that generate code means there is a need for effective testing to validate the generated code, which is where Gradle comes in. She highlighted the significance of running tests quickly and efficiently, identifying failures, and avoiding repetitive failures across teams that are using code generation tools. She added that Gradle Enterprise can contribute to developer productivity by automating aspects of the testing process, similar to how code generation automates code creation.

The goal is not to replace developers’ work but rather to alleviate them from mundane tasks, she said, allowing devs to focus on the business problem at hand, ensuring the tests are meaningful, and verifying that everything operates as expected.

Gee added that Gradle Enterprise also utilizes machine learning for tasks like gathering data on builds, tests, and the environments they run on. This data-rich context presents opportunities for leveraging AI and machine learning techniques, she said.

Career Development in AI Era for Young Devs

Given her experience in the industry, I wondered if Gee had any advice for young developers entering the industry currently, when AI is both a potential boon and (perhaps) an existential threat to developer careers.

Gee highlighted the importance of being adaptable and having a willingness to continuously learn. While there may be new skills to acquire, she said, it is not a major problem as long as developers possess the ability to learn and adapt.

She mentioned git as another example of a new type of skill that developers quickly had to adapt to when it first came out.

“10 years ago, 15 years ago, when I was doing a lot of Java user group stuff with graduates in London, a lot of the graduates were panicking because they came out of university without understanding git,” she said. “And it’s a gap in their technical skill set, sure, but it’s a gap that you learn [to fill] on the job. You don’t need to understand everything about git during your training process. You learn that on the job, you see how other developers are using it, you understand what’s required of you in your team, in your business.”

Ultimately, she thinks that the learning process for new developers will involve acquiring new skills related to AI, similar to how they learn other skills — like using search engines or writing automated tests. So she sees AI as a natural part of the learning journey, rather than a significant shift in the skills required for a career in development.

Don’t Fear AI

Overall, Gee cautions against fear and fear-mongering about AI replacing developers’ jobs. She compares the use of AI tools to code generation features in IDEs, which were initially met with skepticism but are now widely embraced for their ability to make developers’ jobs easier. AI tools can be similarly helpful, she believes.

She added that she herself has used ChatGPT in development work, for thought organization and problem-solving. So it has already been a positive tool in her own job.

The post AI Improves Developer Workflow, Says Gradle Dev Evangelist appeared first on The New Stack.

Guardrails Can Be the Common Language to Bring Dev and Ops Together https://thenewstack.io/guardrails-can-be-the-common-language-to-bring-dev-and-ops-together/ Thu, 18 May 2023 14:04:50 +0000 https://thenewstack.io/?p=22708335

With the rise of Kubernetes adoption and overall expansion of the cloud native landscape, DevOps certainly isn’t dead, but it is definitely changing. The rise of roles like platform engineer is clearly trying to address this strange adolescence that DevOps is going through in the cloud native era.

When adopting a platform as complex as Kubernetes, even the most polished and smooth-running DevOps pipelines have stumbled across gray areas in cloud native workflows that hadn’t been considered prior to the adoption of K8s, forcing teams to grow and adjust. New-world DevOps teams are beginning to take shape, many of which are being led by the ever-shining platform engineering star.

With a focus on empowering all stakeholders in this growth process, platform engineers and DevOps teams are challenged to find ways to break down the ever-existing silos between Devs and Ops to achieve a reliable, fast and frictionless delivery process.

While Devs and Ops might call different corners of the Kubernetes configuration life cycle home, they do share a common goal: Deploy the desired state so their applications run smoothly.

Unfortunately, this means they don’t necessarily share a common language. I’m not talking about how they talk on Zoom calls or in Slack messages to one another — there are already plenty of ways to navigate those language barriers. I’m talking about how these folks — developers, DevOps engineers and platform engineers — keep applying new trends in development culture, like shift left, while struggling to find cloud native tools that work best for their emergent blended workflows.

Often, the chosen tools seem ideal for their perceived area of focus, like an IDE versus a cluster management platform, but each simply creates another language trying to determine the same thing: whether or not the cluster and its applications are working as desired.

Language barriers in tooling create cracks where mistakes start to slip in. And when the environment is complex and the cost of mistakes is high, as with Kubernetes, those with limited experience and those who can’t see the complete picture start to operate out of fear. To avoid introducing costly errors, they stop developing altogether.

What these blended teams need to do to shift left properly is create a common language for answering questions about desired state and cluster health.

Don’t worry, I’m not about to say the common language should be YAML.

The Complicated Landscape of Kubernetes Tooling

Despite the blending of tasks that comes with shift left, like moving testing and validation to the precommit phase rather than the tail end of the CI/CD pipeline, each person involved in the Kubernetes configuration life cycle arrives to work with different ideas about the right tool for their job. They still operate out of a siloed mindset.

  • Configuration developers and backend engineers, for example, do most of their work in an integrated development environment (IDE), like VSCode, using multiple plugins to create a comfortable environment for authoring in YAML, validating changes and interacting with git for collaboration. They likely don’t think their tool choice has any implication on those who pitch in on other tasks in the configuration life cycle, like cluster management, but they can’t see the full picture.
  • DevOps engineers are either asked to fix problems other people created, or rest firmly on the critical path to deployment, spending their days fielding questions from everyone else around the configuration life cycle instead of adding their high-value contributions. They need tools designed for collaboration but don’t want to waste time digging through Slack messages or git commits to understand where they could smooth out operations problems or add key optimizations.
  • Platform engineers are responsible for picking a dozen tools, developing the integrative “glue” with webhooks or APIs to get them all to work together smoothly, and then convincing everyone to hop aboard this new “paved road” experience in an internal development platform. For their development and engineering peers, this platform abstracts away all the complexity by providing a simple self-service/ClickOps experience. But behind the curtain, platform engineers are constantly working to improve internal development platforms by bringing on even more tools and writing more code to minimize conflicts between multiple languages.

Many larger organizations create a common language through platform engineering. It’s a perfectly valid and proven strategy; there are engineers on staff to keep the internal platform online. It’s not an easy move for startups or smaller outfits because of how difficult (and expensive) it can be to hire a successful platform engineering team on top of the application and DevOps engineers required to build applications and experiences.

Let’s consider another way. Instead of an abstracted internal development platform, there’s a common language that empowers people on all corners of the Kubernetes configuration life cycle: guardrails. By defining (and enforcing) what developers and engineers are not allowed to deploy, they have a common ground for learning new techniques, encouraging each other’s continuous education and deploying better software.

How Monokle Unifies Kubernetes Languages and Life Cycles

Monokle is a set of tools — Desktop, CLI and Cloud — that creates a single common language for the entire life cycle. Instead of each team member arriving with new tools and a different language about how to best create and maintain high-quality Kubernetes configurations, they can create blended workflows that don’t require a dozen tools magically working in sync.

Developers and configuration engineers can use Monokle Desktop as a transparent platform for ongoing collaboration on their day-to-day K8s YAML work, which would otherwise stay hidden on their laptops and in their IDEs until they finally push a branch and create a pull request. As they work, Monokle’s guardrail features, like forms that eliminate papercut-like YAML syntax errors and full-on Open Policy Agent (OPA) validation, prevent errors while teaching developers how to improve their configurations in the future.

DevOps engineers can use Monokle’s CLI tooling to add those same features, especially Monokle’s custom validator engine, directly into their CI/CD pipelines for in-depth quality checks at every stage in the configuration life cycle. By doing so, errors are removed from the critical path by deferring questions to the validator, resulting in higher quality.

Platform engineers and other team leaders can leverage Monokle Cloud IDE for Policy Enforcement to define and implement the guardrails that both Desktop and CLI adhere to from a central location. By implementing clear guardrails, errors are prevented from reaching production in the first place. Platform leaders can define and create project-specific policies and rules to reflect business objectives, ensure performance, meet security and compliance criteria, and maximize Git workflows by integrating policy validation into every PR, ultimately achieving consistent, high-quality deployments in less time and with fewer resources.

No matter what version of Monokle these folks might use in their day-to-day work, they’re leveraging the same guardrail-enabling features to achieve common goals in the blended workflows created by shift-left culture:

  • Forms and templates simplify the way developers write their YAML configurations, skipping the frustration of hunting for simple syntax errors that stop deployments in their tracks.
  • Real-time validation helps the most knowledgeable Kubernetes developers on the team to establish best practices and must-follow policies for YAML syntax, OPA rules and the Kubernetes schema itself. With custom rules, DevOps and platform engineers can prevent vulnerable or misconfigured code from even being committed to their git repository in the first place, the purest outcome of the shift-left paradigm.
  • Resource comparisons between local and cluster resources, or between the various git branches that define a cluster’s desired state, for anyone (not just DevOps engineers) who needs to quickly identify and understand the impact of proposed changes as they move from a development cluster to a production cluster. With a line-by-line diff, anyone can catch errors like a mistakenly changed secret or a resource limit change that would affect costs.
  • A Git-based foundation, where all changes are written into commits to be pushed to the git provider of choice, which ensures all roles and departments can review and collaborate on what others are doing. There are no silos, hidden scripts or questions about who made what changes and when.
  • A cluster mode dashboard with observability features, logs and terminal access, recent alerts and more. While many organizations restrict monitoring and troubleshooting work exclusively to DevOps engineers, Monokle makes this information available to anyone, another example of democratizing the educational value of having a common language.

The Path to Deploying Your First Guardrail with Monokle

If you’ve felt like the developers and engineers around you speak different languages, no matter which corner of the Kubernetes configuration life cycle you call home, guardrails might be the common language you’ve needed all along. Successfully deploying applications to Kubernetes is an all-hands-on-deck effort, so why shouldn’t your tools accommodate and encourage collaboration and quality control from the very first line of YAML?

Here are a few guardrails to get started with Monokle:

  • To establish best practices via validators and cluster management, download Monokle Desktop or the CLI tooling, both of which are free and open source.

With shift left blurring pre- versus post-deployment tasks, guardrails are the most transparent path toward creating collaborative workflows, folding education into every commit and deploying high-quality releases to ensure that everyone is speaking the same language.

The post Guardrails Can Be the Common Language to Bring Dev and Ops Together appeared first on The New Stack.

Boost DevOps Maturity with a Data Lakehouse https://thenewstack.io/boost-devops-maturity-with-a-data-lakehouse/ Wed, 17 May 2023 17:20:53 +0000 https://thenewstack.io/?p=22708319

In a world riven by macroeconomic uncertainty, businesses increasingly turn to data-driven decision-making to stay agile.

That’s especially true of the DevOps teams tasked with driving digital-fueled sustainable growth. They’re unleashing the power of cloud-based analytics on large data sets to unlock the insights they and the business need to make smarter decisions. From a technical perspective, however, that’s challenging. Observability and security data volumes are growing all the time, making it harder to orchestrate, process, analyze and turn information into insight. Cost and capacity constraints are becoming a significant burden to overcome.

Data Scale and Silos Present Challenges

DevOps teams are often thwarted in their efforts to drive better data-driven decisions with observability and security data. That’s because of the heterogeneity of the data their environments generate and the limitations of the systems they rely on to analyze this information.

Most organizations are battling cloud complexity. Research has found that 99% of organizations have embraced a multicloud architecture. On top of these cloud platforms, they’re using an array of observability and security tools to deliver insight and control — seven on average. This results in siloed data that is stored in different formats, adding further complexity.

This challenge is exacerbated by the high cardinality of data generated by cloud native, Kubernetes-based apps. For example, a single request-count metric labeled by pod, endpoint and status code across thousands of short-lived pods can explode into millions of distinct time series, and that sheer number of permutations can break traditional databases.

Many teams look to huge cloud-based data lakes, repositories that store data in its natural or raw format, to centralize disparate data. A data lake enables teams to keep as much raw, “dumb” data as they wish, at relatively low cost, until someone in the business finds a use for it.

When it comes to extracting insight, however, data needs to be transferred to a warehouse technology so it can be aggregated and prepared before it is analyzed. Various teams usually then end up transferring the data again to another warehouse platform, so they can run queries related to their specific business requirements.

When Data Storage Strategies Become Problematic

Data warehouse-based approaches add cost and time to analytics projects.

As many as tens of thousands of tables may need to be manually defined to prepare data for querying. There’s also the multitude of indexes and schemas needed to retrieve and structure the data and define the queries that will be asked of it. That’s a lot of effort.

Any user who wants to ask a new question for the first time will need to start from scratch, redefining all those tables and building new indexes and schemas by hand. This can add hours or days to the process of querying data, meaning insights are at risk of being stale or of limited value by the time they’re surfaced.

The more cloud platforms, data warehouses and data lakes an organization maintains to support cloud operations and analytics, the more money it will need to spend. In fact, the storage space required for the indexes that support data retrieval and analysis may end up costing more than the data storage itself.

Further costs will arise if teams need technologies to track where their data is and to monitor data handling for compliance purposes. Frequently moving data from place to place may also create inconsistencies and formatting issues, which could affect the value and accuracy of any resulting analysis.

Combining Data Lakes and Data Warehouses

A data lakehouse approach combines the capabilities of a warehouse and a lake to solve the challenges associated with each architecture, thanks to its enormous scalability and massively parallel processing capabilities. With a data lakehouse approach to data retention, organizations can cope with high-cardinality data in a time- and cost-effective manner, maintaining full granularity and extra-long data retention to support instant, precise and contextual predictive analytics.

But to realize this vision, a data lakehouse must be schemaless, indexless and lossless. Being schema-free means users don’t need to predetermine the questions they want to ask of data, so new queries can be raised instantly as the business need arises.

Indexless means teams have rapid access to data without the storage cost and resources needed to maintain massive indexes. And lossless means technical and business teams can query the data with its full context in place, such as interdependencies between cloud-based entities, to surface more precise answers to questions.

Unifying Observability Data

Let’s consider the key types of observability data that any lakehouse must be capable of ingesting to support the analytics needs of a modern digital business.

  • Logs are the highest volume and often most detailed data that organizations capture for analytics projects or querying. Logs provide vital insights to verify new code deployments for quality and security, identify the root causes of performance issues in infrastructure and applications, investigate malicious activity such as a cyberattack and support various ways of optimizing digital services.
  • Metrics are the quantitative measurements of application performance or user experience that are calculated or aggregated over time to feed into observability-driven analytics. The challenge is that aggregating metrics in traditional data warehouse environments can create a loss of fidelity and make it more difficult for analysts to understand the relevance of data. There’s also a potential scalability challenge with metrics in the context of microservices architectures. As digital services environments become increasingly distributed and are broken into smaller pieces, the sheer scale and volume of the relationships among data from different sources is too much for traditional metrics databases to capture. Only a data lakehouse can handle such high-cardinality data without losing fidelity.
  • Traces are the data source that reveals the end-to-end path a transaction takes across applications, services and infrastructure. With access to the traces across all services in their hybrid and multicloud technology stack, developers can better understand the dependencies they contain and more effectively debug applications in production. Cloud native architectures built on Kubernetes, however, greatly increase the length of traces and the number of spans they contain, as there are more hops and additional tiers such as service meshes to consider. A data lakehouse can be architected such that teams can better track these lengthy, distributed traces without losing data fidelity or context.

There are many other sources of data beyond metrics, logs and traces that can provide additional insight and context to make analytics more precise. For example, organizations can derive dependencies and application topology from logs and traces.

If DevOps teams can build a real-time topology map of their digital services environment and feed this data into a lakehouse alongside metrics, logs and traces, it can provide critical context about the dynamic relationships between application components across all tiers. This provides centralized situational awareness that enables DevOps teams to raise queries about the way their multicloud environments work so they can understand how to optimize them more effectively.

User session data can also be used to gain a better understanding of how customers interact with application interfaces so teams can identify where optimization could help.

As digital services environments become more complex and data volumes explode, observability is certainly becoming more challenging. However, it’s also never been more critical. With a data lakehouse-based approach, DevOps teams can finally turn petabytes of high-fidelity data into actionable intelligence without breaking the bank or becoming burnt out in the effort.

The post Boost DevOps Maturity with a Data Lakehouse appeared first on The New Stack.

Is DevOps Tool Complexity Slowing Down Developer Velocity? https://thenewstack.io/is-devops-tool-complexity-slowing-down-developer-velocity/ Wed, 17 May 2023 13:29:24 +0000 https://thenewstack.io/?p=22708221

The overwhelming majority of developers in a new survey — 84% — say they’re involved in DevOps activities. But despite this, devs haven’t gotten any faster at making code changes and putting them into production over the past two and a half years.

The increasing complexity of the projects DevOps teams work on may be slowing down developer velocity, suggested the report by SlashData and the CD Foundation, a project of the Linux Foundation.

Among the findings:

  • Lead times to restore service after an outage have increased. According to the report, 44% of DevOps practitioners can now restore service within a day, down from 54% when data collection began in the third quarter of 2020.
  • The time needed to implement code changes has not improved. Thirty-seven percent of DevOps practitioners said their lead time for code changes is less than a week. That’s the same percentage that gave that answer in Q3 2020.
  • The proportion of organizations that the researchers define as low performers has increased since Q3 of 2020, while the share of high performers has decreased in that same period. This fits a pattern we have written about previously.

“It is good to see that there is still an increase in the adoption of CD and DevOps practices. However, there are signs that we still have work to do,” said Fatih Degirmenci, executive director of the CD Foundation, in an email response to The New Stack.

“CD and DevOps require organizations to change how they organize themselves, get their teams to embrace the cultural and mindset changes and adapt their product structures,” Degirmenci wrote, adding, “These changes usually take some time to implement and show their effects.”

He also noted the impact of the complexity involved, including “not just how the products are built (e.g. microservices) but also the surrounding environment — from infrastructure to development environments to CD technologies, and so on.”

The new report’s findings are derived from data collected for SlashData’s past six Developer Nation surveys, which reached more than 125,000 respondents from the third quarter of 2020 to the first quarter of 2023.

Use of Self-Hosted CI/CD Tools Declined

The study’s results showed how organizations are using DevOps technologies. Not much has changed since Q1 of 2022, including the average number of tools used, which has held steady at 4.5.

However, some trends emerged:

  • The use of self-hosted CI/CD tools has dropped from 32% in Q1 2022 to 23% in Q1 2023.
  • Application monitoring/observability tools saw the second biggest decline in usage, going from 37% to 31% in the same period.
  • Few areas of security saw increased use. The largest increase was for application security testing technologies, which increased from 25% to 28% over the past year.

CI/CD tools appear to be a decisive factor in how quickly organizations can restore service: Fifty-eight percent of those that use CI/CD tools can restore within a day, versus 35% of those that don't.

The more CI/CD technologies organizations use, the less time it takes to restore. For example, about 55% of DevOps practitioners need more than a week to restore services, but that share drops steadily with each additional tool, falling to approximately 20% among those using eight or more tools. But there’s a caveat: the more self-hosted tools developers use, the longer it takes to restore service after an incident.

More than half of the organizations that use five or more self-hosted tools were defined as low performers by the researchers, while only 10% of organizations using that many self-hosted tools are considered high performers.

“An increasing number of tools used having such a strongly negative impact on service restoration time has multiple possible explanations,” the report stated. “However, interoperability issues may be at the centre of many of them. Multiple tools may make it challenging to integrate all of them well, leading to a greater challenge to isolate the service-impacting issue at hand.

“Further, a lack of standardisation between tools may make it more difficult for all tools to work together well.”

As teams choose from among the ever-expanding landscape of DevOps tools, they should consider how well the tools they use play together, Degirmenci suggested.

“One thing organizations could do and benefit from,” he wrote to The New Stack, “is including interoperability as a criteria during their technology evaluations, so they can reduce the complexity as well as reach greater flexibility when it comes to adding new tools to their environments.”

The post Is DevOps Tool Complexity Slowing Down Developer Velocity? appeared first on The New Stack.

GitOps as an Evolution of Kubernetes https://thenewstack.io/gitops-as-an-evolution-of-kubernetes/ Tue, 16 May 2023 15:31:11 +0000 https://thenewstack.io/?p=22708181

VANCOUVER, British Columbia — Many people talk about GitOps and Kubernetes, but when Brendan Burns, a Microsoft Corporate Vice President, a Distinguished Engineer at Microsoft Azure, and, oh yeah, co-founder of Kubernetes, talks, I listen. Burns spoke at The Linux Foundation’s GitOpsCon about how GitOps is an evolutionary step for Kubernetes.

How? Burns started by explaining how it’s deeply rooted in the development of continuous integration, deployment, and delivery. What really motivated him to help create Kubernetes was the unreliability of deployments at the time: “When we were starting out, we tried to put together reliable deployments. We worked on this using the DevOps tools of the time with a mixture of Puppet, Chef, Salt, and Ansible — and Bash obviously — and it worked about 85% of the time. And then you’d massage it, and it eventually would work maybe 95% of the time.” The journey was often fraught with difficulties and uncertainties, which birthed the idea of Kubernetes.

Kubernetes’ inception was essentially a response to the arduous and unreliable nature of the deployment process. It was a fusion of the DevOps challenges and the innovative strides Docker made in the container revolution. Docker’s focus on hermetically sealing and packaging applications was a vital prerequisite to reimagining how deployments could be executed. Over the past decade, this approach has transformed into the standard modus operandi within the tech community.

Advent of GitOps

But the tech world has now moved a step further with the advent of GitOps. GitOps is no longer aimed at redefining the deployment process itself; it is about the entire journey — from sourcing configurations to deploying them into the world where Kubernetes can utilize them.

Declarative configuration stored in Git now plays a pivotal role in ensuring reliable delivery and contributes to the ongoing evolution of the community. “While it’s universally accepted now,” said Burns, “the idea was a subject of contention at the time.” Scripting was rampant. Notably, the CI/CD pipeline, even when described in YAML, was an imperative program execution. Burns thinks GitOps, with its inherent declarative nature, is a welcome reinforcement to the Kubernetes ecosystem.

Moreover, empowering people to do more was another central theme of the original thought process, Burns said. The goal was to alleviate the burdens that plagued developers daily. This, in essence, is the journey of the community — from its inception rooted in deployment and continuous delivery to the present day, where GitOps reigns, offering a more reliable, declarative and user-empowering approach to managing deployments.

It does this in several ways:

  1. Separation of Concerns: With Kubernetes and GitOps, teams can be compartmentalized, focusing on specific tasks and responsibilities. This clean delineation can help avoid confusion, improve efficiency and make it clear where one team’s responsibilities end and another’s begin.
  2. Multiple Personas: In modern software development, there are many personas involved, such as developers, platform engineers, and security teams. Each has a specific role and responsibilities, and all need to work together in the same environment.
  3. GitOps as a Solution: GitOps can help manage this complex environment. It allows each persona to manage a Git repository, rather than needing to directly interact with the cluster. This can reduce the risks associated with one group having too much control and can make it easier for teams to work together. It essentially allows for a clearer division of labor and less risk of overlap or conflict.
  4. Automated Updates: GitOps can also facilitate automatic updates. Tools such as Dependabot can monitor repositories and propose updates when necessary (a minimal configuration sketch follows this list). This process reduces the time and effort required to stay up to date, increasing efficiency and reducing the risk of falling behind on important updates.
  5. Security and Compliance: GitOps also supports better security and compliance. Through a well-managed Git repository, it can ensure that every change is tracked and auditable, which is important for meeting compliance requirements.
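
To make the automated-updates point concrete, a standard Dependabot configuration committed alongside the application keeps container images and CI actions current; the ecosystems and schedule below are just one plausible setup:

```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "docker"          # watch image tags in Dockerfiles
    directory: "/"
    schedule:
      interval: "weekly"
  - package-ecosystem: "github-actions"  # keep CI workflow actions current
    directory: "/"
    schedule:
      interval: "weekly"
```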

The GitOps workflow and its intersection between platform engineering and the developer is particularly significant for programmers who prefer not to be bogged down by the intricacies of deploying their code into Kubernetes. Irrespective of their preferred programming language — be it Java, Python, .NET, Rust, or Go — they simply want to push their code, generate a container image, and have it deployed immediately. GitOps enables them to do this.

Scalability

Burns continued, the beauty of GitOps lies in its scalability. Developers need not be overly concerned with the number of clusters in their organization or their specific locations. The shift from a push model of pipelines to a GitOps pull model allows a level of abstraction where the number of clusters becomes somewhat irrelevant. Developers only have to deal with a Git repository. If a new cluster emerges or an old one disappears, developers may not even notice.

The consistency of the workflows remains even when transitioning from early pre-production to staging to production in the application lifecycle. This decreases the cognitive load on developers, allowing them to concentrate more on their code rather than where it goes post-deployment.

Thus, in GitOps, the Git repository becomes the ultimate source of truth, and the platform engineering team can concentrate on initializing that Git repository, thus empowering developers to efficiently deploy their code.

Burns also reminded us that historically, the concept of “snowflakes” (one-off, unique servers impossible to reconstruct if they “melted”) was a cause of concern. True, containers and orchestration eliminated this problem at the individual container level. However, we now face the issue of “snowflake clusters” — clusters of machines that are uniform internally but differ from others.

GitOps, Burns said, offers a robust solution for this issue. The shift from a push to a pull model makes GitOps relatively indifferent to the scale or number of clusters. Each cluster is configured to point to the same Git repository. When you make the Git repository initialization part of creating clusters, it automatically creates clusters that are initialized with the correct software versions.
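
What pointing every cluster at the same repository looks like depends on the GitOps engine in use. As one hedged sketch, an Argo CD Application applied during cluster bootstrap could pin each new cluster to a shared configuration repo; the repo URL and paths here are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cluster-baseline
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/cluster-config.git  # placeholder repo
    targetRevision: main
    path: clusters/base
  destination:
    server: https://kubernetes.default.svc  # the local cluster
    namespace: default
  syncPolicy:
    automated:
      prune: true     # remove resources that disappear from Git
      selfHeal: true  # revert out-of-band drift back to the state in Git
```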

Thus, this process ensures consistency across the platform. For example, it also eliminates the chances of forgetting to include a cluster in a pipeline that deploys a new version of security software or having to inform a development team about changes in regions. This consistency and reliability are among the main advantages of GitOps.

Interestingly, the application of GitOps is not restricted to Kubernetes but extends to public cloud resources through service operators. Users are leveraging the Kubernetes control plane to manage containerized resources and instances of a Postgres database or blob storage system. GitOps can manage resources within your cluster as well as those in the cloud, thus widening its scope and utility.

No Be-All, End-All

However, GitOps is not the be-all and end-all solution. There’s a place for both CI/CD pipelines and GitOps: “It’s not a fight, but rather it’s two very complementary technologies, one that is very good at easily making the state real and one that is very good at orchestrating stages of what you want the world to look like.”

Drawing parallels with robotics, which Burns worked on before he came to software and where there is a constant handoff between control and planning, one can understand the relationship between traditional CI/CD pipeline systems and GitOps. GitOps is like a controller, quickly making a state reality, but it’s not ideal for software rollouts on a global scale that require slow, gradual deployments. This is where traditional CI/CD systems, or “planners,” come into play.

So, Burns concluded, CI/CD pipelines and GitOps each have their strengths — GitOps in bringing a specific state into reality with ease, and traditional CI systems in orchestrating stages of what the world should look like. Understanding the value of GitOps in the container context and its interplay with traditional CI systems can significantly enhance efficiency and productivity. And all, of course, will work well in a Kubernetes-orchestrated world.

The post GitOps as an Evolution of Kubernetes appeared first on The New Stack.

4 Core Principles of GitOps https://thenewstack.io/4-core-principles-of-gitops/ Thu, 11 May 2023 21:17:02 +0000 https://thenewstack.io/?p=22707981

It’s at the point where GitOps is getting enough notice that a brief on its principles is appropriate.

Last year, the OpenGitOps community released GitOps Principles 1.0. There’s general support for GitOps and many competing interests in developing GitOps engines, such as with Argo and Flux, two graduate projects from the Cloud Native Computing Foundation. But focusing on principles lets everyone know what GitOps is and, even more so, helps define what it is not.

OpenGitOps defines GitOps as a set of principles for operating and managing software systems, wrote open source software engineer Scott Rigby. “When using GitOps, the Desired State of a system or subsystem is defined declaratively as versioned, immutable data, and the running system’s configuration is continuously derived from this data. These principles were derived from modern software operations but are rooted in pre-existing and widely adopted best practices.”

With DevOps, operations and development teams collaborate on their own chosen tools. GitOps provides a declarative approach and complements DevOps. It allows for application delivery and cluster management. GitOps shares the same concepts as Kubernetes, making it easier for teams already working with it to adapt.

The cdCON + GitOpsCon conference, held this week in Vancouver, BC, featured a presentation about the GitOps principles by Rigby and Christian Hernandez, a senior principal product manager with Red Hat.

Here are a few takeaways, from this talk and others at the conference:

Principle #1: GitOps Is Declarative

A system managed by GitOps must have a desired state expressed declaratively.
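
In Kubernetes terms, a declaratively expressed desired state is simply a versioned manifest that describes the end result rather than the steps to reach it. A minimal sketch, with a placeholder image:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                  # the desired state, not a procedure
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.4.2  # placeholder image tag
```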

GitOps allows for automating security practices, said Eve Ben Ezra, a software engineer with The New York Times, who spoke at the cdCON + GitOpsCon event. DevOps encourages collaboration, which means incorporating security into every stage of the software development lifecycle.

The comparison to security practices dovetails with the second principle of GitOps:

Principle #2: GitOps Apps Are Versioned and Immutable

The desired state is stored in a way that enforces immutability, versioning, and complete version history.

The general viewpoint: rollbacks should be simple.

“You go a step further with GitOps, which provides an auditable trail of all changes to infrastructure and applications,” Ben Ezra said.

Versioning allows organizations to find gaps in their security and also allows for testing and declarative infrastructure, Ben Ezra said. Using tools like the Open Policy Agent, an open source project for establishing authorization policies across cloud native environments, allows for more productivity because once it’s automated, teams spend less time agonizing over whether or not their infrastructure is compliant, which gives them more time for innovation and feature development.

“While automation is an important part of DevOps, it’s by no means the only one. The methodology also calls for cross-functional collaboration, breaking down silos and sharing knowledge across an org,” Ben Ezra said. “GitOps builds on these principles, leveraging automation and infrastructure as code to reduce configuration drift and provide a single source of truth for an entire team or org.

“By writing it down, all team members can contribute to the infrastructure code, which promotes shared responsibility across the entire software development lifecycle. Just as importantly, everyone is aware of these changes, so they can speak up if they see something you missed.”

“For example, at the New York Times, we’ve leveraged utilities from OPA to improve feedback and developer productivity within GitOps operational framework,” Ben Ezra said.

Principle #3: GitOps Apps Are Pulled Automatically

Software agents automatically pull the desired state declarations from the source.

Take Flux, for example. Flux is a set of continuous and progressive delivery solutions for Kubernetes that are open and extensible, enabling GitOps and progressive delivery for developers and infrastructure teams.

“So Flux is a set of continuous delivery tools that are focused on security, velocity and reliability,” said Priyanka Pinky Ravi, a developer experience engineer with Weaveworks, in a keynote with other engineers from Cloud Native Computing Foundation graduated projects.

“And they are focused on automation. So the idea is that you have this set of Kubernetes controllers that you install onto your cluster, and they’re running on a reconciliation loop, which is just an interval that you set. And every time that runs, the source controller will go in and pull from whatever source you set, such as a Git repository, Helm repo, image registry or OCI registry. And the idea is that it pulls the manifests that it finds there and then actually applies them onto your cluster.”
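
In Flux, that loop is expressed as a source plus a reconciler. A minimal sketch follows; the API versions and repo URL are assumptions that vary by Flux release:

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: GitRepository
metadata:
  name: app-config
  namespace: flux-system
spec:
  interval: 1m                 # how often to poll the source for changes
  url: https://github.com/example/app-config  # placeholder repo
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: app
  namespace: flux-system
spec:
  interval: 10m                # the reconciliation interval Ravi describes
  sourceRef:
    kind: GitRepository
    name: app-config
  path: ./deploy               # manifests to apply onto the cluster
  prune: true                  # delete cluster objects removed from Git
```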

Principle #4: GitOps Apps Are Continuously Reconciled

“Software agents continually observe the actual system state and attempt to apply the desired state.”

Of note are the differing views within the GitOps community. But with a set of principles, the community can build approaches that reflect the core focus of what GitOps is and what it is not; at least, that’s the concept.

Start reading the discussions on GitHub; there are still issues to clarify, even when explaining the meaning of pull.

And there’s more on the Twitter threads. Enjoy.

The post 4 Core Principles of GitOps appeared first on The New Stack.

Runtime Security: Relevancy Is What Counts https://thenewstack.io/runtime-security-relevancy-is-what-counts/ Thu, 11 May 2023 12:00:15 +0000 https://thenewstack.io/?p=22706808

Security experts, as well as many — if not most — developers and software engineers, know that an organization deploying software is almost inevitably working with insecure code. Code and applications are often rife with vulnerabilities throughout the CI/CD process. Other stakeholders, such as the CTO, might have at least some inkling of the status quo, but they may not know the severity or magnitude of those vulnerabilities, or how exposed software remains even at runtime.

Security best practices have emerged, including those for cloud native deployments, though that work remains in progress. SBOMs, signatures and other security practices continue to improve, but detecting and remediating vulnerabilities in deployed code and applications remains unfinished business. Intuitively, it may seem rational to rely on the Common Vulnerability Scoring System (CVSS) to prioritize the severity of vulnerabilities. This helps somewhat reduce the whack-a-mole approach to detecting and removing vulnerabilities throughout CI/CD and during deployment. However, this, too, can fall short.

This is where a different approach appears more applicable, more relevant and, of course, time-saving: relevancy, defined in part as the severe vulnerabilities that actually remain in the container and code at runtime. The trick, of course, is to automate the pinpointing of these vulnerabilities and to rank which ones matter most, based on the overlap of their severity, fixability and, especially, relevancy.

Prioritization

Ideally, vulnerabilities would be prioritized by the likelihood and severity of future revenue impact — which is similar to how traditional project management is prioritized, Torsten Volk, an analyst for Enterprise Management Associates (EMA), said. A container might include a Python library infected with ransomware, but Volk said this could be irrelevant if:

  • The application code running in this container does not actually use the infected library.
  • Strict container networking policies block malware from accessing the ports it needs (see the sketch after this list).
  • The container runs on a Kubernetes cluster without access to the types of data sources targeted by the malware program.
  • The malware needs access to a highly privileged account to be able to spread, while the container runs a bare-bones account that lacks the required privileges.
  • The container does have access to a vulnerable data source, but the data source only contains cafeteria menus and the score sheets of the corporate software team.
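
As a hedged sketch of the networking bullet above, a default-deny egress policy is one such context factor: with no egress rules listed, pods in the namespace cannot open outbound connections at all. The namespace name is a placeholder:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: prod            # placeholder namespace
spec:
  podSelector: {}            # selects every pod in the namespace
  policyTypes:
    - Egress                 # no egress rules defined, so all egress is denied
```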

“Even these few examples demonstrate that a successful attack heavily depends on the context of its target,” Volk said. “However, identifying the relevant context factors and prioritizing vulnerabilities accordingly is where the magic lies.”

It is also important to take into account that Kubernetes is still a relatively young and fast-evolving technology, Oshrat Nir, head of product marketing at ARMO, told The New Stack. “While it has started to plateau, the talent gap still exists. Pairing that with the current macroeconomic climate means that DevOps or platform teams are, and will continue to be, short-staffed, yet they have more jobs to be done than ever before,” Nir said. “As a result, prioritization has become more important than ever.”

This goes double for security, Nir said: “A major security breach can taint an organization’s reputation for a long time, making the hit to the bottom line something that takes longer to repair than the breach itself.”

Relevancy

Kubernetes security tool provider ARMO says it has released in beta a new eBPF-based capability: vulnerability relevancy and prioritization. It allows ARMO Platform and Kubescape users to deprioritize vulnerabilities that belong to unused software packages and components. By deprioritizing the less critical vulnerabilities first, users can focus on addressing the ones that pose a greater threat to their running cluster.

This release is also important given that, on average, it takes weeks or even longer to apply security fixes. “As a result, it would behoove DevSecOps practitioners to first fix the vulnerabilities that expose them the most. The problem is that most scanners return a (long) list of vulnerabilities to the users with little to no indication of what to fix first,” Nir said. “This will often leave teams paralyzed, and planning the work of patching, testing and deploying the patch can take weeks.”

The typical way of sorting through vulnerabilities is by their criticality, Nir explained. “The thing is that many software packages in containers aren’t even used at runtime, which means they pose less risk than their criticality would lead us to believe. This goes double now that hackers, knowing about this best practice, actually try to infiltrate with the more innocuous, less critical vulnerabilities,” Nir said.

ARMO’s relevancy feature pinpoints the vulnerabilities that should be fixed first, Nir said. “While it includes fixability, criticality and the ability to access the vulnerability remotely (i.e. code injection or remote code execution), it also factors in whether the software packages are actually in use,” Nir said. “In this way, security teams can filter 60%-80% of vulnerabilities out of their immediate to-do list and focus on the things that need to be solved first.”

The post Runtime Security: Relevancy Is What Counts appeared first on The New Stack.

Mirantis Updates k0s Lightweight Kubernetes Distro https://thenewstack.io/mirantis-updates-k0s-lightweight-kubernetes-distro/ Wed, 10 May 2023 15:00:41 +0000 https://thenewstack.io/?p=22707750

Mirantis, the Docker and Kubernetes developer company, has released the latest version of its lightweight, open source Kubernetes distribution, k0s. The new version boasts compatibility with the brand-new Kubernetes 1.27 release, along with various other improvements and bug fixes.

Back to Basics

K0s, for those that don’t know it, is one of several stripped-down, back-to-basics Kubernetes distros. Others include Minikube, k3s, and MicroK8s. While they all have their differences, the name of the game is to give developers the power to create Kubernetes clusters on low-end hardware. For example, K0s can run on as little as a single CPU and 1GB RAM for a single node.

The updated Mirantis k0s distribution significantly simplifies the installation and management process of Kubernetes clusters. One of the key enhancements is support for containerd plug-ins, such as WebAssembly (WASM) and gVisor container sandboxes. This enhancement simplifies the running of these containers and enables users to extend their clusters with additional container runtimes effortlessly.

Furthermore, to eliminate custom forks of project components and to ensure greater compatibility with upstream Kubernetes functionality, Mirantis now provides its own system images, which in turn reduces complexity and improves security.

For one thing, many upstream Kubernetes system images contain Common Vulnerabilities and Exposures (CVEs). For instance, Miska Kaipiainen, Mirantis VP of Engineering, Strategy & Open Source Software, states that “If you scan a kube-proxy image at registry.k8s.io/kube-proxy:v1.25.8, you’ll see 12 vulnerabilities reported (or some other number, depending on the scanner you use).” Sure, many of these CVEs, such as old curl binaries and libs in the container, aren’t used at runtime. But you never know when that “harmless” CVE might turn out to be exploitable. So Mirantis takes full control of k0s images, builds them with pure upstream functionality and doesn’t rely on any custom forks of project components.

The result? “As of this writing, system images shipping with k0s 1.27 come with zero (0) – yes, zero – known vulnerabilities. We have daily scanning in place, which lets us keep track of vulnerabilities as they pop up and mitigate them super-quickly.”

CNCF Certified

A Cloud Native Computing Foundation (CNCF)-certified Kubernetes distribution, k0s is versatile enough to run on any Linux-based operating system, making it suitable for large-scale data center deployments, lightweight edge clusters, laptops and even Raspberry Pi. K0s is distributed as a single binary and can be installed on any node from the internet with a single command.

For ease of management, platform deployment and scaling can be administered locally via the k0s command-line interface (CLI) and remotely via the k0sctl utility using configuration files. The built-in k0s Autopilot manages updates automatically. Additionally, operators can access k0s clusters via kubectl, Lens Desktop and other standard Kubernetes CLIs and dashboards.
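
To give a sense of that file-driven management, here is a minimal k0sctl configuration sketch; the addresses, SSH key path and k0s version are placeholders rather than recommendations:

```yaml
# k0sctl.yaml; apply with: k0sctl apply --config k0sctl.yaml
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: demo-cluster
spec:
  hosts:
    - role: controller
      ssh:
        address: 192.168.1.10     # placeholder address
        user: root
        keyPath: ~/.ssh/id_rsa
    - role: worker
      ssh:
        address: 192.168.1.11     # placeholder address
        user: root
        keyPath: ~/.ssh/id_rsa
  k0s:
    version: v1.27.1+k0s.0        # placeholder; pick a current release
```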

So, if you want a safe, lightweight Kubernetes for your work, play, or study, I’d give K0s a try. It’s a nice little distro.

The post Mirantis Updates k0s Lightweight Kubernetes Distro appeared first on The New Stack.
