Software Development Overview, News & Trends | The New Stack

An E-tailer’s Journey to Real-Time AI: Recommendations

Aaron Ploetz — Thu, 15 Jun 2023 17:41:53 +0000

The journey to implementing artificial intelligence and machine learning solutions requires solving a lot of common challenges that routinely crop up in digital systems: updating legacy systems, eliminating batch processes and using innovative technologies that are grounded in AI/ML to improve the customer experience in ways that seemed like science fiction just a few years ago.

To illustrate this evolution, let’s follow a hypothetical contractor who was hired to help implement AI/ML solutions at a big-box retailer. This is the first in a series of articles that will detail important aspects of the journey to AI/ML.

The Problem

First day at BigBoxCo on the “Infrastructure” team. After working through the obligatory human resources activities, I received my contractor badge and made my way over to my new workspace. After meeting the team, I was told that we have a meeting with the “Recommendations” team this morning. My system access isn’t quite working yet, so hopefully IT will get that squared away while we’re in the meeting.

In the meeting room, it’s just a few of us: my manager and two other engineers from my new team, and one engineer from the Recommendations team. We start off with some introductions, and then move on to discuss an issue from the week prior. Evidently, there was some kind of overnight batch failure last week, and they’re still feeling the effects of that.

It seems like the current product recommendations are driven by data collected from customer orders. With each order, there’s a new association between the products ordered, which is recorded. When customers view product pages, they can get recommendations based on how many other customers bought the current product alongside different products.

The product recommendations are served to users on bigboxco.com via a microservice layer in the cloud. The microservice layer uses a local (cloud) data center deployment of Apache Cassandra to serve up the results.

How the results are collected and served, though, is a different story altogether. Essentially, the results of associations between products (purchased together) are compiled during a MapReduce job. This is the batch process that failed last week. While this batch process has never been fast, it has become slower and more brittle over time. In fact, sometimes the process takes two or even three days to run.

Improving the Experience

After the meeting, I check my computer and it looks like I can finally log in. As I’m looking around, our principal engineer (PE) comes by and introduces himself. I tell him about the meeting with the Recommendations team, and he gives me a little more of the history behind the Recommendation service.

It sounds like that batch process has been in place for about 10 years. The engineer who designed it has moved on, not many people in the organization really understand it, and nobody wants to touch it.

The other problem, I begin to explain, is that the dataset driving each recommendation is almost always a couple of days old. While this might not be a big deal in the grand scheme of things, if the recommendation data could be made to be more up to date, it would benefit the short-term promotions that marketing runs.

He nods in agreement and says that he’s definitely open to suggestions on how to improve the system.

Maybe a Graph Problem?

At the onset, this sounds to me like a graph problem. We have customers who log on to the site and buy products. Before that, when they look at a product or add it to the cart, we can show recommendations in the form of “Customers who bought X also bought Y.” The site has this today, in that the recommendations service does exactly this: It returns the top four additional products that are frequently purchased together.

But we’d have to have some way to “rank” the products, because the mapping of one product to every other purchased at the same time by any of our 200 million customers is going to get big, fast. So we can rank them by the number of times they appear in an order. A graph of this system might look something like what is shown below in Figure 1.

Figure 1. A product recommendation graph showing the relationship between customers and their purchased products.

After modeling this out and running it on our graph database with real volumes of data, I quickly realized that this isn’t going to work. The traversal from one product to nearby customers to their products and computing the products that appear most takes somewhere in the neighborhood of 10 seconds. Essentially, we’ve “punted” on the two-day batch problem, to have each lookup putting the traversal latency precisely where we don’t want it: in front of the customer.

But perhaps that graph model isn’t too far off from what we need to do here. In fact, the approach described above is a machine learning (ML) technique known as “collaborative filtering.” Essentially, collaborative filtering is an approach that examines the similarity of certain data objects based on activity with other users, and it enables us to make predictions based on that data. In our case, we will be implicitly collecting cart/order data from our customer base, and we will use it to make better product recommendations to increase online sales.

Implementation

First of all, let’s look at data collection. Adding an extra service call to the shopping “place order” function isn’t too big of a deal. In fact, it already exists; it’s just that data gets stored in a database and processed later. Make no mistake: We still want to include the batch processing. But we’ll also want to process that cart data in real time, so we can feed it right back into the online data set and use it immediately afterward.

We’ll start out by putting in an event streaming solution like Apache Pulsar. That way, all new cart activity is put on a Pulsar topic, where it is consumed and sent to both the underlying batch database as well as to help train our real-time ML model.

As for the latter, our Pulsar consumer will write to a Cassandra table (shown in Figure 2) designed simply to hold entries for each product in the order. The product then has a row for all of the other products from that and other orders:

CREATE TABLE order_products_mapping (
    id text,
    added_product_id text,
    cart_id uuid,
    qty int,
    PRIMARY KEY (id, added_product_id, cart_id)
) WITH CLUSTERING ORDER BY (added_product_id ASC, cart_id ASC);

Figure 2. Augmenting an existing batch-fed recommendation system with Apache Pulsar and Apache Cassandra.

We can then query this table for a particular product (“DSH915” in this example), like this:

SELECT added_product_id, SUM(qty)
FROm order_products_mapping
WHERE id='DSH915'
GROUP BY added_product_id;

 added_product_id | system.sum(qty)
------------------+-----------------
            APC30 |               7
           ECJ112 |               1
            LN355 |               2
            LS534 |               4
           RCE857 |               3
          RSH2112 |               5
           TSD925 |               1

(7 rows)

We can then take the top four results and put them into the product recommendations table, ready for the recommendation service to query by product_id:

SELECT * FROM product_recommendations
WHERE product_id='DSH915';

 product_id | tier | recommended_id | score
------------+------+----------------+-------
     DSH915 |    1 |          APC30 |     7
     DSH915 |    2 |        RSH2112 |     5
     DSH915 |    3 |          LS534 |     4
     DSH915 |    4 |         RCE857 |     3

(4 rows)

In this way, the new recommendation data is constantly being kept up to date. Also, all of the infrastructure assets described above are located in the local data center. Therefore, the process of pulling product relationships from an order, sending them through a Pulsar topic and processing them into recommendations stored in Cassandra happens in less than a second. With this simple data model, Cassandra is capable of serving the requested recommendations in single-digit milliseconds.

Conclusions and Next Steps

We’ll want to be sure to examine how our data is being written to our Cassandra tables in the long term. This way we can get ahead of potential problems related to things like unbound row growth and in-place updates.

Some additional heuristic filters may be necessary to add as well, like a “do not recommend” list. This is because there are some products that our customers will buy either once or infrequently, and recommending them will only take space away from other products that they are much more likely to buy on impulse. For example, recommending a purchase of something from our appliance division such as a washing machine is not likely to yield an “impulse buy.”

Another future improvement would be to implement a real-time AI/ML platform like Kaskada to handle both the product relationship streaming and to serve the recommendation data to the service directly.

Fortunately, we did come up with a way to augment the existing, sluggish batch process using Pulsar to feed the cart-add events to be processed in real time. Once we get a feel for how this system performs in the long run, we should consider shutting down the legacy batch process. The PE acknowledged that we made good progress with the new solution, and, better yet, that we have also begun to lay the groundwork to eliminate some technical debt. In the end, everyone feels good about that.

In an upcoming article, we’ll take a look at improving product promotions with vector searching.

Learn how DataStax enables real-time AI.

The post An E-tailer’s Journey to Real-Time AI: Recommendations appeared first on The New Stack.

How Dell’s Data Science Team Benefits from Agile Practices

Loraine Lawson — Thu, 15 Jun 2023 16:38:17 +0000

Agile development doesn’t work for data science… at least, not at first, said Randi Ludwig, Dell Technologies’ director of Data Science. That’s because, in part, there is an uncertainty that’s innate to data science, Ludwig told audiences at the Domino Data Lab Rev4 conference in New York on June 1.

“One of the things that breaks down for data science, in terms of agile development practices, is you don’t always know exactly where you’re going,” Ludwig said. “I haven’t even looked at that data. How am I supposed to know where do I even start with that?”

Nonetheless, Dell uses agile practices with its data science team and what Ludwig has found is that while there is a certain amount of uncertainty, it’s contained to the first part of the process where data scientists collect the data, prove there’s value and obtain sign-off from stakeholders. To manage that first part, she suggested time boxing it to three or four weeks.

“The uncertainty really only lies in the first part of this process,” she said. “What that agile looks like in the first half and then the second half of the process are different on a day-to-day basis for the team.”

After the uncertainty period, the rest of the data science process is more like software development, and agile becomes beneficial, she said.

Ludwig interwove how Dell implements agile practices in data science with the benefits the team reaps from those practices.

Benefits of Standups

First, standups should include anyone involved in a data science project, including data engineers, analysts and technical project managers, Ludwig said. Just talking to each other on a regular basis tends to fly in the face of how data scientists inherently work in isolation, but it helps put everyone on the same page and delivers value by adding context and avoiding rework. This pays dividends in that team members can step in for one another more than they can under the “lone wolf” approach to data science.

“Doing standups gives visibility to everybody else in the story,” she said. “That lack of context goes away just by talking to each other every day, and then if you actually write down what you talk about every day, you get other amazing benefits out of it.”

The standup doesn’t necessarily need to be every day, but it should be a recurring cadence that’s short enough that the project can’t go wildly afield, she added.

Benefits of Tickets

Documenting tickets is also a key practice that’s easy to do while alleviating single points of failure, she said, plus tickets have the benefit of not being onerous documentation.

“Just the fact of having things written down and talking to each other every day is massively beneficial, and in my experience is not how data science teams organically develop most of the time,” she said.

In the second half of the data science process, teams can articulate more clearly what exactly they’re going to do so tickets become possible. It’s important not to be too broad when writing tickets, however. Instead, break big ideas down into bite-sized chunks of work, she advised.

“‘I’m going to do EDA (exploratory data analysis) on finance data’ is way too broad. That’s way too big of a ticket. You’ve got to break those things down into smaller pieces,” she said. “Even just getting the team to articulate what are the some of the things you’re going to look for — you’re going to look for missing values, you’re going to look for columns that are high-quality data, you’re going to look to see if there’s any correlations between some of those columns — so that you’re not doing bringing in redundant features.”

It also helps inform the team about the why and how of the models being built. There can also be planning tickets that incorporate questions that need to be asked, she said.

Tickets become another form of data that can be used in year-end reviews and for the management of the team. For instance, one of Ludwig’s data scientists was able to demonstrate through tagged tickets how much time was spent on building data pipelines.

“Data scientists are not best at building data pipelines, you need data engineers for that,” Ludwig said. “This is a great tool because now I know that I need to either redistribute resources I have or go ask for more resources. I actually need more data engineers.”

Tickets can also be used to document problems encountered by the data science team. For instance, Ludwig was able to use tickets to show the database management team all the problems they were encountering with a particular database, thus justifying improvements to that database.

It can be challenging to get members to make tickets and keep them updated, she acknowledged, so she has everyone opened to Github so they can update the tickets during the standup.

Benefits of a Prioritization Log

Tickets also allow the team to create a prioritization log, she said. That triggers a slew of benefits, such as providing the team with support when there is pushback from stakeholders about requests.

“This magical thing happens where now you have stuff written down, which means you have a prioritization backlog, you can actually go through all of the ideas and thoughts you’ve had and figure out how to prioritize the work instead of just wondering,” she said. “You actually foster much less contentious relationships with stakeholders in terms of new asks by having all of the stuff written down.”

Stakeholders will start to understand that for the team to prioritize their request, they need to do some homework such as identifying what data sold be used, what business unit will consume the output of the data and what they think it should look like.

Another benefit: It can keep data scientists from wandering down rabbit holes as they explore the data. Instead, they should bring those questions to the standup and decide as a team for prioritizing.

”This helps you on your internal pipeline, as well as your intake with external stakeholders. Once they see that you have a list to work against, then they’re, ‘Oh, I need to actually be really specific about what I’m asking from you,’” she said.

Finally, there’s no more “wondering what the data science team is doing” and whether it will deliver benefits.

“One of the biggest concerns I’ve ever heard from leadership about data science teams is that they don’t know what your plan’s going to be, what are you going to deliver in 12 or 18 months, how many things I could learn between here that’s going to completely change whatever I tell you right now,” she said. “At least now you know that this investment has a path and a roadmap that’s going to continue to provide value for a long time.”

Benefits of Reviews and Retrospectives

“Stakeholders are just really convinced that people just disappear off into an ivory tower, and then they have no idea what are those data scientists doing,” Ludwig said.

There’s a lot of angst that can be eliminated just by talking with business stakeholders, which review sessions give you a chance to do. It’s important to take the time to make sure they understand what you’re working on, why and what you found out about it, and that you understand their business problem.

Retrospectives are also beneficial because they allow the data science team to reflect and improve.

“One of the things that I actually thought was one of the most interesting about data scientists or scientists at heart, they love to learn, they love to make things more efficient and optimize, but the number of teams that organically just decide to have retrospectives is very small, in my experience,” she said. “Having an organized framework of we’re going to sit down and periodically review what we’re doing and make sure we learn from it is an ad hoc thing that some people do or some people don’t. Just enforcing that regularly has a ton of value.”

Domino Data Lab paid for The New Stack’s travel and accommodations to attend the Rev4 conference.

The post How Dell’s Data Science Team Benefits from Agile Practices appeared first on The New Stack.

The Transformative Power of SBOMs and IBOMs for Cloud Apps

Oren Penso — Thu, 15 Jun 2023 16:20:44 +0000

As we continue to navigate the digital landscape, it is clear that every innovation brings with it a wealth of opportunities as well as a host of challenges. One of the most prevalent trends in today’s tech world is the increasing reliance on cloud-based applications. These applications offer flexibility, scalability and reliability but also introduce complexity, mainly when operating in multicloud or hybrid environments. We must adopt a fresh perspective to manage this ever-evolving IT ecosystem effectively.

In this blog post, I want to explore a transformative concept that could redefine the way we manage our business applications: the integration of the software bill of materials (SBOM) and infrastructure bill of materials (IBOM).

SBOM and IBOM: A Unified Approach to Tech Management

Traditionally, an SBOM serves as an inventory list detailing all components of software, including libraries and dependencies. It plays a crucial role in managing software updates, ensuring compliance and facilitating informed decision-making. However, in today’s intricate application landscape, having knowledge of the software alone is insufficient.

This is where the concept of the IBOM comes into play. An IBOM is a comprehensive list of all critical components a business application requires to run, including network components, databases, message queuing systems, cache layers systems, cloud infrastructure components and cloud services. By integrating an SBOM and an IBOM, we can better understand our application environment. This powerful combination enables us to effectively manage critical areas such as security, performance, operations, data protection and cost control.

The Business Benefits of SBOM and IBOM Integration

The integration of an SBOM and an IBOM offers numerous benefits that can enhance various aspects of business operations:

Security – A comprehensive view of both software and infrastructure components allows organizations to identify potential vulnerabilities early on. This level of visibility is critical for bolstering data protection and reducing overall risk. In essence, complete visibility acts as a safety net, enabling businesses to safeguard their digital assets from threats.
Performance – Detailed knowledge of software and infrastructure components can significantly enhance application performance. Improved performance translates into superior customer experiences and more efficient business operations, ultimately leading to increased customer satisfaction and profitability.
Operations – A complete view of all application components facilitates effective operational planning. This not only simplifies the deployment and maintenance of applications but also streamlines workflows and boosts operational efficiency.
Cost Control – The granular information provided by SBOMs and IBOMs enables businesses to make informed decisions, optimize resource utilization and manage costs effectively. By strategically deploying resources, businesses can eliminate unnecessary expenditures and invest in areas that offer the highest value.

Navigating the Complex World of Cloud-Based Applications

The rise of homegrown applications has led to a significant increase in the number of applications that need to be managed. Coupled with the shift toward cloud-based applications and the complexities associated with multicloud or hybrid environments, this trend underscores the importance of having a comprehensive SBOM and IBOM.

Without a thorough understanding of their application landscape, organizations may find it challenging to manage and prioritize operational and security tasks. SBOMs and IBOMs are indispensable tools for effective control and management in this rapidly evolving applications and infrastructure era.

Embracing the Future of Automation and Integration: The Role of GitOps

The future of business applications presents exciting opportunities for automation and integration. As the complexity and scale of applications continue to grow, manual management is becoming increasingly challenging. Automating the creation and maintenance of SBOMs and IBOMs is crucial to keeping pace with the rapidly changing tech landscape.

One of the most promising approaches to this automation and integration is GitOps. GitOps is a paradigm or a set of practices that empowers developers to perform tasks that typically fall under IT operations’ purview. GitOps leverages the version control system as the single source of truth for declarative infrastructure and applications, enabling developers to use the same git pull requests they use for code review and collaboration to manage deployments and infrastructure changes.

In the context of SBOMs and IBOMs, GitOps can automate the process of tracking and managing changes to both software and infrastructure components. By storing the SBOM and IBOM in a git repository, any changes to the software or infrastructure can be tracked and managed through git. This simplifies the management process and enhances visibility and traceability, which are crucial for security and compliance.

Moreover, these automated systems could be integrated into secure, automated supply chains, marking this technological revolution’s next phase. This is an exciting prospect and one that holds immense potential for businesses looking to streamline their operations and enhance their efficiency. With GitOps, the creation and maintenance of SBOMs and IBOMs become a part of the natural development workflow, making it easier to keep up with the fast-paced world of cloud-based applications.

The Role of SBOMs and IBOMs in Compliance and Auditing

Another significant advantage of integrating SBOMs and IBOMs is their crucial role in compliance and auditing. In today’s digital landscape, the emphasis on data privacy and security has never been greater. Businesses must adhere to many regulations, from data protection laws like GDPR and California Consumer Privacy Act (CCPA) to industry-specific regulations such as Health Insurance Portability and Accountability Act (HIPAA) in healthcare and Payment Card Industry Data Security Standard (PCI DSS) in finance.

Having comprehensive SBOMs and IBOMs provides the necessary transparency and traceability to meet these regulatory requirements. They serve as a detailed inventory of all software and infrastructure components, including their versions, configurations and interdependencies. This level of detail is crucial for demonstrating compliance with regulations requiring businesses to thoroughly understand their IT environment.

For instance, in the event of a data breach, an SBOM and IBOM can help a team identify which components were affected and assess the extent of the breach. This can aid in incident response and reporting, both of which are key requirements of data protection regulations.

The integration of SBOM and IBOM is not just about managing complexity in the cloud-based app era. It’s also about ensuring that businesses can meet their compliance obligations and maintain the trust of their customers in an increasingly regulated and security-conscious digital landscape.

The Future Is Integrated

As we continue to navigate the digital future, it’s clear that the integration of SBOMs and IBOMs will play a pivotal role in managing the complexity of cloud-based applications. Providing a comprehensive view of our application environment can help businesses enhance security, improve performance, streamline operations and control costs.

The future of business applications is undoubtedly integrated. By embracing the power of SBOMs and IBOMs, businesses can not only navigate the complexities of the digital landscape but also unlock new opportunities for growth and innovation. As we continue to explore the potential of these tools, one thing is clear: The future of tech management is here, and it’s integrated.

The post The Transformative Power of SBOMs and IBOMs for Cloud Apps appeared first on The New Stack.

Apache SeaTunnel Integrates Masses of Divergent Data Faster

Susan Hall — Thu, 15 Jun 2023 13:58:46 +0000

The latest project to reach top-level status with the Apache Software Foundation (ASF) was designed to solve common problems in data integration. Apache SeaTunnel can ingest and synchronize massive amounts of data from disparate sources faster, greatly reducing the cost of data transfer.

“Currently, the big data ecosystem consists of various data engines, including Hadoop, Hive, Kudu, Kafka, HDFS for big data ecology, MongoDB, Redis, ClickHouse, Doris for the generalized big database ecosystem, AWS S3, Redshift, BigQuery, Snowflake in the cloud, and various data ecosystems like MySQL, PostgreSQL, IoTDB, TDEngine, Salesforce, Workday, etc.,” Debra Chen, community manager for SeaTunnel, wrote in an email message to The New Stack.

“We need a tool to connect these data sources. Apache SeaTunnel serves as a bridge to integrating these complex data sources accurately, in real-time, and with simplicity. It becomes the ‘highway’ for data flow in the big data landscape.”

The open source tool is described as an “ultra-high-performance distributed data integration platform that supports real-time synchronization of massive data.” We’re talking tens of billions of data points a day.

Efficient and Rapid Data Delivery

Begun in 2017 and originally called Waterdrop, the project was renamed in October 2021 and entered the ASF incubator in December the same year. Created by a small group in China, SeaTunnel since has grown to more than 180 contributors around the world.

Built in Java and other languages, and it consists of three main components: source connectors, transfer compute engines and sink connectors. The source connectors read data from the source end (it could be JDBC, binlog, unstructured Kafka or Software as a Service API, or AI data models) and transform the data into a standard format understood by SeaTunnel.

Then the transfer compute engines process and distribute the data (such as data format conversion, tokenization, etc.). Finally, the sink connector transforms the SeaTunnel data format into the format required by the target database for storage.

“Of course, there are also complex high-performance data transfer mechanisms, distributed snapshots, global checkpoints, two-phase commits, etc., to ensure efficient and rapid data delivery to the target end,” Chen said.

SeaTunnel provides a connector API that does not depend on a specific execution engine. While it uses its own SeaTunnel Engine for data synchronization by default, it also supports multiple versions of Spark and Flink. The plug-in design allows users to easily develop their own connector and integrate it into the SeaTunnel project. It currently supports more than 100 connectors.

It supports various synchronization scenarios, such as offline-full synchronization, offline-incremental synchronization, change data capture (CDC), real-time synchronization and full database synchronization.

Enterprises use a variety of technology components and must develop corresponding synchronization programs for different components to complete data integration. Existing data integration and data synchronization tools often require vast computing resources or Java database connectivity connection resources to complete real-time synchronization. SeaTunnel aims to ease these burdens, making data transfer faster, less expensive and more efficient.

New Developments in the Project

In October 2022, SeaTunnel released its major version 2.2.0, introducing SeaTunnel Zeta engine, its data integration-specific computing engine and enabling cross-engine connector support.

Last December it added support for CDC synchronization, and earlier this year added support for Flink 1.15 and Spark 3. The Zeta engine was enhanced to support CDC full-database synchronization, multi-table synchronization, schema evolution and automatic table creation.

The community also recently submitted SeaTunnel-Web, which allows users not only to use SQL-like languages for transformation but also to directly connect different data sources, using a drag-and-drop interface.

“Any open source user can easily extend their own connector for their data source, submit it to the Apache community, and enable more people to use it,” Chen said. “At the same time, you can quickly solve the data integration issues between your enterprise data sources by using connectors contributed by others.”

SeaTunnel is used in more than 1,000 enterprises, including Shopee, Oppo, Kidswant and Vipshop.

What’s Ahead for SeaTunnel?

Chen laid out these plans for the project going forward:

SeaTunnel will further improve the performance and stability of the Zeta engine and fulfill the previously planned features such as data definition language change synchronization, error data handling, flow rate control and multi-table synchronization.
SeaTunnel-Web will transition from the alpha stage to the release stage, allowing users to define and control the entire synchronization process directly from the interface.
Cooperation with the artificial general intelligence component will be strengthened. In addition to using ChatGPT to automatically generate connectors, the plan is to enhance the integration of vector databases and plugins for large models, enabling seamless integration of over 100 data sources.
The relationship with the upstream and downstream ecosystems will be enhanced, integrating and connecting with other Apache ecosystems such as Apache DolphinScheduler and Apache Airflow. Regular communication occurs through emails and issue discussions, and major progress and plans of the project and community are announced through community media channels to maintain openness and transparency.
After supporting Google Sheets, Feishu (Lark), and Tencent Docs, it will focus on constructing SaaS connectors, such as ChatGPT, Salesforce and Workday.

The post Apache SeaTunnel Integrates Masses of Divergent Data Faster appeared first on The New Stack.

Can Companies Really Self-Host at Scale?

Jessica Wachtel — Wed, 14 Jun 2023 14:47:06 +0000

There’s no such thing as free lunch, or in this case, free software. It’s a myth. Paul Vixie, vice president of security at Amazon Web Services, creator of the original Domain Name System (DNS), gave a compelling presentation at Open Source Summit Europe 2022 about this topic. His presentation included a comprehensive list of “dos and don’ts” for consumers of free software. Vixie’s docket included labor-intensive, often expensive, engineering work that ran the gamut of small routine upgrades to locally maintaining orphaned dependencies.

To sum the “dos and don’ts” up in one sentence though, engineer(s) are always working, monitoring, watching and ready for action. This “ready for action” engineer must have high-level expertise so that they can handle anything that comes their way. Free software isn’t inherently bad, and it definitely works. Identifying the hidden costs of selecting software also applies to the decision to self-host a database. Self-hosting is effective for many companies. But when is it time to let go and try the easier way?

What Is a Self-Hosted Database?

Self-hosted databases come in many forms. Locally hosted open source databases are the most obvious example. However, many commercial database products have tiered packages that include self-managed options. On-premises hosting comes with pros and cons: low security risk, the ability to work directly beside the data and complete control over the database are a few advantages. There is, of course, the problem with scaling. Self-hosting creates challenges for any business or developer team with spiky or unreliable traffic because on-demand scaling is impossible. Database engineers must always account for the highest amount of traffic with on-premises servers or otherwise risk an outage in the event of a traffic spike.

For businesses that want to self-host and scale on demand, self-hosting in the cloud is another option. This option allows businesses with spiky or less predictable traffic to scale alongside their needs. When self-hosting in the cloud, the cloud provider installs and hosts their database on a virtual machine in a traditional deployment model. When you’re hosting a commercial database in the cloud, support for cloud and the database is minimal because self-hosted always means your engineering resources helm the project. This extends to emergencies like outages and even security breaches.

The Skills Gap

There are many skilled professionals with experience managing databases at scale on-premises and in the cloud. SQL databases were the de facto database for decades. Now, with the rise of more purpose-built databases geared toward deriving maximum value from the data points they’re storing, the marketplace is shifting. Newer database types that are gaining a foothold within the community are columnar databases, search engine databases, graph databases and time series databases. Now developers familiar with these technologies can choose what they want to do with their expertise.

Time Series Data

Gradient Flow expects the global market for time series analysis software will grow at a compound annual rate of 11.5% from 2020 to 2027. Time series data is a vast category and includes any data with a timestamp. Businesses collect time series data from the physical world through items like consumer Internet of Things (IoT), industrial IoT and factory equipment. Time series data originating from online sources include observability metrics, logs, traces, security monitoring and DevOps performance monitoring. Time series data powers real-time dashboards, decision-making and statistical and machine learning models that heavily influence many artificial intelligence applications.

Bridging the Skills Gap

InfluxDB 3.0 is a purpose-built time series database that ingests, stores and analyzes all types of time series data in a single datastore, including metrics, events and traces. It’s built on top of Apache Arrow and optimized for scale and performance, which allows for real-time query responses. InfluxDB has native SQL support and open source extensibility and interoperability with data science tools.

InfluxDB Cloud Dedicated is a fully managed, single-tenant instance of InfluxDB created for customers who require privacy and customization without the challenges of self-hosting. The dedicated infrastructure is resilient and scalable with built-in, multi-tier data durability with 2x data replication. Managed services mean around-the-clock support, automated patches and version updates. A higher level of customization is also a characteristic of InfluxDB Cloud Dedicated. Customers choose the cluster tier that best matches their data and workloads for their dedicated private cloud resources. From the many customizable characteristics, increased query timeouts and in-memory caching are two.

Conclusion

It’s up to every organization to decide whether to self-manage or choose a managed database. Decision-makers and engineers must have a deep understanding of the organization’s needs, traffic flow patterns, engineering skills and resources and characteristics of the data before reaching the best decision.

To get started, check out this demo of InfluxDB Cloud Dedicated, contact our sales team or sign up for your free cloud account today.

The post Can Companies Really Self-Host at Scale? appeared first on The New Stack.

Reducing Complexity with a Multimodel Database

Ted Neward — Tue, 13 Jun 2023 19:42:48 +0000

“Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation).”

With these words, E.F. Codd (known as “Ted” to his friends) began the seminal paper that begat the “relational wave” that would spend the next 50 years dominating the database landscape.

“Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed.”

When Codd wrote this paper back in 1969, data access was in its infancy: Programmers wrote code that accessed flat files or tables and followed “pointers” from a row in one file to a row in a separate file. By introducing a “model” of data that encapsulated the underlying implementation (of how data was stored and retrieved) and putting a domain-specific language (in this case, SQL) in front of that model, programmers found their interaction with the database elevated away from the physical details of the data, and instead were free to think more along the logical levels of their problem, code and application.

Whether Codd knew this or not, he was tapping into a concept known today as a “complexity budget:” the idea that developers — any organization, really — can only keep track of so much complexity within their projects or organization. When a project reaches the limits of that budget, the system starts to grow too difficult to manage and all sorts of inefficiencies and problems arise — difficulties in upgrading, tracking down bugs, adding new features, refactoring code, the works. Codd’s point, really, was simple: If too much complexity is spent navigating the data “by hand,” there is less available to manage code that captures the complexities of the domain.

Fifty years later, we find ourselves still in the same scenario — needing to think more along logical and conceptual lines rather than the physical details of data. Our projects wrestle with vastly more complex domains than ever before. And while Codd’s model of relational data has served us well for over a half-century, it’s important to understand that the problem, in many respects, is still there — different in detail than the one that Codd sought to solve, but fundamentally the same issue.

Models in Nature

In Codd’s day, data was limited in scope and nature, most of it business transactions of one form or another. Parts had suppliers; manufacturers had locations; customers had orders. Creating a system of relationships between all of these was a large part of the work required by developers.

Fifty years later, however, data has changed. Not only has the amount of data stored by a business exploded by orders of magnitude (many orders of magnitude), but the shape of the data generated is wildly more irregular than it was in Codd’s day. Or, perhaps fairer to say, we capture more data than we did 50 years ago, and that data comes in all different shapes and sizes: images, audio and video, to start, but also geolocation information, genealogy data, biometrics, and that’s just a start. And developers are expected to be able to weave all of it together into a coherent fabric and present it to end users in a meaningful way. And — oh, by the way — the big launch is next month.

For its time, Codd’s relational model provided developers with exactly that — a way to weave data together into a coherent fabric. But with the growth of and changes to the data with which we have to contend, new tactics, ones which didn’t throw away the relational model but added upon it, were necessary.

We wrought what we could using the concept of “polyglot persistence,” the idea of bringing disparate parts together into a distributed system. But as any experienced architect will be all too familiar, the more different and distinct nodes in a distributed system, the greater the complexity. And the more complexity we must spend on manually stitching together data from different nodes in the database system, the less we have to spend on the complexity of the domain.

Nature of Storage

But complexity doesn’t live just in the shape of the data we weave; it also lives in the very places we store it.

What Codd hadn’t considered, largely because it was 50 years too early, is that databases also carry with them a physical concern that has to do with the actual physical realm — the servers, the disks on which the data is stored, the network and more. For decades, an organization “owning” a database has meant a non-trivial investment into all the details around what that ownership means, including the employment of a number of people whose sole purpose is the care and feeding of those machines. These “database administrators” were responsible for machine procurement and maintenance, software upgrades and patches, backups and restorations and more — all before ever touching the relational schema itself.

Like the “physical” details of data access 50 years ago, devoting time to the physical details of the database’s existence is also a costly endeavor. Between the money and time spent doing the actual maintenance as well as the opportunity cost of it being offline and unavailable for use, keeping a non-trivial database up and running is a cost that can often grow quite sizable and requires deep amounts of ongoing training and learning for those involved.

Solutions

By this point, it should be apparent that developers need to aggressively look for ways to reduce accidental and wasteful spending of complexity. We seek this in so many ways; the programming languages we use look for ways to promote encapsulation of algorithms and data, for example, and libraries and services tuck away functionality behind APIs.

Providing a well-encapsulated data strategy in the modern era often means two things: the use of a multimodel database to bring together the differing shapes of data into a single model, and the use of a cloud database provider to significantly reduce the time spent managing the database’s operational needs. Which one you choose is obviously the subject of a different conversation — just make sure it’s one that supports all the models your data needs, in an environment that requires the smallest management necessary.

Multimodel brings all the benefits of polyglot persistence, without the disadvantages of it. Essentially, it does this by supporting a document store (JSON documents), a key/value store and other data storage models (multiple databases) into one database engine that has a common query language and a single API for further access. Learn more about Couchbase’s multimodel database here, and try Couchbase for yourself today with our free trial.

The post Reducing Complexity with a Multimodel Database appeared first on The New Stack.

3 Ways to Drive Open Source Software Maturity

Jay Livens — Tue, 13 Jun 2023 19:00:53 +0000

Open source software (OSS) is taking over the world. It’s a faster, more collaborative and flexible way of driving software innovation than proprietary code. This flexibility appeals to developers and can help organizational leadership drive down costs while supporting digital transformation goals. The figures speak for themselves: 80% of organizations increased their OSS use in 2022, especially those operating in critical infrastructure sectors such as oil and gas, telecommunications and energy.

However, open source is not a panacea. There can be challenges around governance, security and the balance between contributing to OSS development and preserving a commercial advantage. These each need careful consideration if developers want to maximize the impact of their work on open source projects.

Open Source Software Saves Time and Drives Innovation

There’s no one-size-fits-all approach with OSS. Projects could range from relatively small software components, such as general-purpose Java class libraries, to major systems, such as Kubernetes for container management or Apache’s HTTP server for modern operating systems. Those projects receiving regular contributions from reputable sources are likely to be most widely adopted and frequently updated. But there is already a range of proven benefits across them all.

Open source can save time and resources, as developers don’t have to expend their own energies to produce code. The top four OSS ecosystems are estimated to have recorded over 3 trillion requests for components last year. That’s a great deal of effort potentially saved. It also means those same developer teams can focus more fully on proprietary functionality that advances the baseline functionality available through OSS to boost revenue streams. It’s estimated just $1.1 billion invested in OSS in the EU back in 2018 generated $71 billion to $104 billion for the regional economy.

OSS also encourages experts from across the globe — whether individual hobbyists or DevOps teams from multinational companies — to contribute their coding skills and industry knowledge. The idea is projects will benefit from a large and diverse pool of developers, driving up the quality of the final product. In contributing to these projects, businesses and individuals can stake a claim to the future direction of a particular product or field of technology, helping to shape it in a way that advances their own solutions. Companies also benefit from being at the leading edge of any new discoveries and leaps in innovation as they emerge, so they can steal a march on the competition by being first to market.

This, in turn, can help to drive a culture of innovation at organizations that contribute regularly to OSS. Alongside a company’s track record on patents, their commitment to OSS projects can be a useful indicator to prospective new hires of their level of ambition, helping attract the brightest and best talent going forward.

Three Ways to Drive OSS Maturity

To maximize the benefit of their contributions to the OSS community, DevOps leaders should ensure their organization has a clear, mature approach. There are three key points to consider in these efforts:

1. Define the Scope of the Organization’s Contribution

OSS is built on the expertise of a potentially wide range of individuals and organizations, many of whom are otherwise competitors. This “wisdom of the crowd” can ultimately help to create better-quality products more quickly. However, it can also raise difficult questions about how to keep proprietary secrets under wraps when there is pressure from the community to share certain code bases or functionality that could benefit others. By defining at the outset what they want to keep private, contributors can draw a clear line between commercial advantage and community spirit to avoid such headaches later.

2. Contribute to Open Standards

Open standards are the foundation on which OSS contributors can collaborate. By getting involved in these initiatives, organizations have a fantastic opportunity to shape the future direction of OSS, helping to solve common problems in a manner that will enhance the value of their commercial products. OpenTelemetry is one such success story. This collection of tools, application programming interfaces and software development kits simplifies the capture and export of telemetry data from applications to make tracing more seamless across boundaries and systems. As a result, OpenTelemetry has become a de facto industry standard for the way organizations capture and process observability data, bringing them closer to achieving a unified view of hybrid technology stacks in a single platform.

3. Build Robust Security Practices

Despite the benefits of OSS, there’s always a risk of vulnerabilities slipping into production if they’re not detected and remediated quickly and effectively in development environments. Three-quarters (75%) of chief information security officers (CISOs) worry the prevalence of team silos and point solutions throughout the software development lifecycle makes it easier for vulnerabilities to fly below the radar. Their concerns are valid. The average application development project contains 49 vulnerabilities, according to one estimate. These risks will only grow as ChatGPT-like tools are increasingly used to support software development by compiling code snippets from open source libraries.

Given the dynamic, fast-changing nature of cloud native environments and the sheer scale of open source use, automation is the only way DevOps teams can take control of the situation. To support this, they should converge security data with real-time, end-to-end observability to create a unified source of insights. By combining this with trustworthy AI that can understand the full context behind that observability and security data, teams can unlock precise, real-time answers about vulnerabilities in their environment. Armed with those answers, they can implement security gates throughout the delivery pipeline so bugs are automatically resolved as soon as they are detected.

OSS is increasingly important to long-term success, even for commercially motivated organizations. How effectively they’re able to harness and contribute to its development will define the winners and losers of the next decade. If they put careful consideration into these three key points, DevOps leaders will bring their organizations much closer to being recognized as a leading innovator in their industries.

The post 3 Ways to Drive Open Source Software Maturity appeared first on The New Stack.

The First Kubernetes Bill of Materials Standard Arrives

Steven J. Vaughan-Nichols — Tue, 13 Jun 2023 17:48:51 +0000

If you’re not using a Software Bill of Materials (SBOM) yet, you will be soon. They’re seen as essential groundwork for building code security defense. While there are many SBOM standards, such as Software Package Data Exchange (SPDX), CycloneDX: and GitHub’s dependency submission format, there hasn’t been one just for the popular container orchestration program Kubernetes until now: Kubernetes Security Operations Center’s (KSOC) Kubernetes Bill of Materials (KBOM) standard.

At this early stage, KBOM is a rough first draft. It provides an initial specification in JavaScript Object Notation (JSON) It’s been shown to work with Kubernetes 1.19 and newer; hyperscale cloud services providers; and do-it-yourself Kubernetes.

With the KBOM’s shell interface, cloud security teams can gain a comprehensive understanding of third-party tooling within their environment. This development is aimed at enabling quicker responses to the surge of new Kubernetes tooling vulnerabilities.

Is It Necessary?

Is there really a need for this, though, since there are many SBOM standards? Since Kubernetes is used by over 96% of organizations to orchestrate container deployments, clearly there’s a deployment security gap here. After all, Kubernetes security adoption remains low, with 34% in 2022. A major barrier to securing Kubernetes is getting an accurate grasp of the environment’s scope.

As KSOC CTO Jimmy Mesta explained: “Kubernetes is orchestrating the applications of many of the biggest business brands we know and love. Adoption is no longer an excuse, and yet from a security perspective, we continually leave Kubernetes itself out of the conversation when it comes to standards and compliance guidelines, focusing only on the activity before application deployment.” Therefore, “We are releasing this KBOM standard as a first step to getting Kubernetes into the conversation when it comes to compliance guidelines. ”

To meet these needs, KBOM offers a concise overview of a Kubernetes cluster’s elements. These include:

Workload count.
Cost and type of hosting service.
Vulnerabilities for both internal and hosted images.
Third-party customization, for example, the deployed custom resources, authentication, and service mesh.
Version details for the managed platform, the Kubelet, and more.

Sounds interesting? It should. To contribute, you can download the CLI tool today or learn more about the standard. You can also work on this Apache 2 open source program via its GitHub page.

The post The First Kubernetes Bill of Materials Standard Arrives appeared first on The New Stack.

A CTO’s Guide to Navigating the Cloud Native Ecosystem

Jess Lulka — Tue, 13 Jun 2023 16:39:29 +0000

While container and cloud technology are increasingly mature, there are still a lot of different software, staffing and architecture considerations that CTOs must address to ensure that everything runs smoothly and operates together.

The Gartner “A CTO’s Guide to Navigating the Cloud Native Container Ecosystem” report estimates that by 2028, more than 95% of global organizations will be running containerized applications in production, which is a significant increase from fewer than 50% in 2023.

This level of adoption means that organizations must have the right software to effectively manage, monitor and run container-based, cloud native environments. And there is a multitude of options for CTOs and enterprise architecture (EAs) leaders to sift through, which makes it hard to get environments level-set and to standardize processes.

“Despite the apparent progress and continued industry consolidation, the ecosystem remains fragmented and fast-paced. This makes it difficult for EAs andCTOs to build robust cloud native architectures and institute operational governance,” the authors state.

As container adoption expands for cloud native environments, more IT leaders will see an increase in both vendor and open source options. Such variety makes it harder to select the right tools to run a cloud native ecosystem and stretches out the evaluation process.

Here’s a look at container ecosystem components, software offerings and how CTOs can evaluate the best configuration for their organization.

What Are the Components of Container-Based Cloud Native Ecosystems?

Gartner explains that “containers are not a monolithic technology, the ecosystem is a hodgepodge of several components vital for production readiness.”

The foundation of a containerized ecosystem includes:

Container runtime lets developers deploy applications, configurations and other container image dependencies.
Container orchestrator supports features for policy-based deployment, application configuration management, high availability cluster establishment and container integration into overall infrastructure.
Container management software provides a management console, automation features, plus operational, security and developer tools. Vendors in this sector include Amazon Web Services (AWS), Microsoft, Google, RedHad, SUSE and VMware.
Open source tools and code: The Cloud Native Computing Foundation is the governance body that hosts several open source projects in this space.

These components all help any container-based applications run on cloud native architecture to support business functions and IT operations, such as DevOps, FinOps, observability, security and APIs. There are lots of open source projects that support all of these architectural components and platform engineering tools for Kubernetes.

At the start of cloud native ecosystem adoption, Gartner recommends:

Map your functional requirements to the container management platforms and identify any gaps that can be potentially filled by open source projects and commercial products outlined in this research for effective deployments.

Choose open source projects carefully, based on software release history, the permissiveness of software licensing terms and the vibrancy of the community, characterized by a broad ecosystem of vendors that provide commercial maintenance and support.

What Are the Container Management Platform Components?

Container management is an essential part of cloud native ecosystems; it should be top of mind during software selection and container environment implementation. But legacy application performance monitoring isn’t suited for newer cloud technology.

Cloud native container management platforms include the following tools:

Observability enables a skilled observer — a software developer or site reliability engineer — to effectively explain unexpected system behavior. Gartner mentions Chronosphere for this cloud native container management platform.
Networking manages communication inside the communication pod, between cluster containers and from the outside world.
Storage delivers granular data services, high availability and performance for stateful applications with deep integration with the container management systems.
Ingress control gatekeeps network communications of a container orchestration cluster. All inbound traffic to services inside the cluster must pass through the ingress gateway.
Security and compliance provides assessment of risk/trust of container content, secrets management and Kubernetes configurations. It also extends into production with runtime container threat protection and access control.
Policy-based management lets IT organizations programmatically express IT requirements, which is critical for container-based environments. Organizations can use the automation toolchain to enforce these policies.

More specific container monitoring platform components and methodologies include Infrastructure as Code, CI/CD, API gateways, service meshes and registries.

How to Effectively Evaluate Software for Cloud Native Ecosystems

There are two types of container platforms that bring all required components together: integrated cloud infrastructure and platform services (CIPS) and software for the cloud.

Hyperscale cloud providers offer integrated CIPS capabilities that allow users to develop and operate cloud native applications with a unified environment. Almost all of these providers can deliver an effective experience within their platforms, including some use cases of hybrid cloud and edge. Key cloud providers include Alibaba Cloud, AWS, Google Cloud, Microsoft Azure, Oracle Cloud, IBM Cloud and Tencent.

Vendors in this category offer on-premises, edge solutions and may offer either marketplace or managed services offerings in multiple public cloud environments. Key software vendors include Red Hat, VMware, SUSE (Rancher), Mirantis, HashiCorp (Nomad), etc.

Authors note critical factors of platform provider selection include:

Automated, secure, and distributed operations
- Hybrid and multicloud
- Edge optimization
- Support for bare metal
- Serverless containers
- Security and compliance
Application modernization
- Developer inner and outer loop tools
- Service mesh support
Open-source commitment
Pricing

IT leaders can figure out which provider has the most ideal offering if they match software to their infrastructure (current and future), security protocols, budget requirements, application modernization toolkit and open source integrations.

Gartner recommends that organizations:

Strive to standardize on a consistent platform, to the extent possible across use cases, to enhance architectural consistency, democratize operational know-how, simplify developer workflow and provide sourcing advantages.

Create a weighted decision matrix by considering the factors outlined above to ensure an objective decision is made.

Prioritize developers’ needs and their inherent expectations of operational simplicity, because any decision that fails to prioritize the needs of developers is bound to fail.

Read the full report to learn about ways to effectively navigate cloud native ecosystems.

The post A CTO’s Guide to Navigating the Cloud Native Ecosystem appeared first on The New Stack.

At PlatformCon: For Realtor.com, Success Is Driven by Stories

Jennifer Riggins — Tue, 13 Jun 2023 16:31:53 +0000

You’re only as good as the stories you tell. Storytelling, after all, is a tenet of humanity, and the best way to pass information, at least when it’s anchored in context. It’s also a pillar of successful sales. No matter what you’re selling or who you’re selling it to.

For platform engineering, your eager or not-so-eager audience is made up of your colleagues, the internal developers as well as other company-wide stakeholders and influencers. You have to understand the context and needs of your different target personas, and how they could respond to the changes you’re making. Much of intentional developer experience and platform adoption hinges on your ability to convey what works and what hasn’t, often socratically repeating back to be sure you comprehend your stakeholders’ stakes — and making sure they feel heard.

For Realtor.com, a platform engineering mindset is anchored in the power of success stories. Suzy Julius, SVP of product and engineering, joined the virtual PlatformCon stage to share how the top U.S. real estate site, with 86 million visits per month, went from a culture where you couldn’t say platform to a culture that embraces it.

The First Step Is always Recognition

Realtor.com is a company that’s over the last couple of years scaled mainly via acquisition, which often results in either spaghetti architecture or a complete lack of visibility into other business units. It pretty much always signals an increase in complexity.

“Our tech stack became extremely complex, slowing down our ability to build features in a fast and reliable way,” Julius said. “The existing tech stack made it difficult to ensure a quality product or ensure reliable feature releases.”

Facing its divergent and often duplicated tech ecosystem, in 2020, the company embarked on a transformation, with the aim to “simplify to scale” in order to accelerate innovation.

A platform emerged as the solution.

When Julius joined the company at the start of 2021, her team recognized the common barriers to entry to platform adoption, mainly, “knowing that there was a reluctance to building a platform, with fear that one would slow down the engineering team by creating more complexity.” Not an uncommon hurdle for platform engineers to face at all.

So the platform team kicked off this journey gaining feedback from a diverse background of stakeholders, not just from engineering, but from business and security, and offered a compelling success story, she explained. Now, 150 people are considered part of the platform organization — a mix of product leaders and engineers, who she said are all “focused on developer experience, data, content and personalization.”

Next, It’s Time to Adopt a Product Mindset

Come 2022 and the platform team was embracing a platform mindset, concentrating on developer enablement and providing a service to their colleagues. Specifically, Julius outlined the aims as:

To provide service to others to help everyone go faster and more reliably.
To understand as a platform team the vision and principles, and then to get corporate buy-in.
To be able to show short-term and long-term wins.
To measure, iterate and evangelize the vision to be a platform empowering all products and unlocking business opportunities.

These goals, she said, mostly focused on developer experience, but they also created a data platform for a “clear line of sight to understand business metrics or give analytics the ability to create a canonical source of truth dataset for our consumer and customers.”

The tech stack that drove this sociotechnical change included:

For developer experience — CircleCI, Apollo supergraph, GraphQL, Amazon EKS. ArgoCD, Tyk API gateway, Vault developer portal
For data, content and personalization — Fivetran automated data movement platform, Snowflake for data integration, Apache Kafka, DBT for data warehousing, Apache Airflow, NodeJS, Amazon SageMaker for machine learning, Optimizely, Metaflow data science framework, ElasticSearch

All the platform tech, people and processes are aligned around the vision to become the preferred platform on which their internal customers choose to build. That is grounded, Julius explained, in connecting wins with features that drive business metrics, namely, revenue and/or user engagement.

She highlighted sociotechnical lessons they learned over the past year:

A platform mindset is not just a technical but a cultural shift.
Adoption hinges on training, documentation and awareness.
You need a tighter feedback loop to establish stakeholder sentiment.
Be aware not to over-index on microservices. For example, they had rate-limiting in different locations, which Julius said made it hard to build client features.
Align around a few programming languages, as too many make it much harder to build cross-company platform features like logging and monitoring.
And, in a time of tighter budgets, make sure you commit to continuously invest in your platform service, no matter what.

Keep up the Momentum

Now, this year at Realtor.com is all about embracing the Platform as a Product mindset and building a differentiated, self-service product suite. Treating your platform as a product is about treating your developers like your customers, always focusing on improving developer experience or DevEx. For Realtor.com, this includes continuous feedback and stakeholder scorecards.

This year is about “understanding that we need to continue to solve problems and to make it easy and intuitive to use our platform,” Julius said. “And we need to realize gains beyond tech, [like] more involvement and input into what the platforms do and how they can help the entire company.”

Many of the platform engineering thought leaders The New Stack has interviewed have talked about the importance of using the platform as a single pane of glass to create a common language between business and engineering. This helps business understand the value of the big cost center that is engineering, while engineering can better connect their work to driving real business value to end customers. Julius’s team stands out in looking to leverage the platform to measure that effect. She said they are currently working “to incorporate how platforms impact our end-user strategy and experience,” connecting the DevEx to the DevOps.

They are also working out how to evangelize the platform internally. Like with all things, communication is key, including around onboarding and design-first thinking. They are customizing their messaging for different stakeholders. Julius noted they all have to get comfortable repeating themselves to not get lost in the email and Slack cacophony. The platform team is also considering adopting a tool like Backstage to help facilitate that internal product marketing and to, as she said, “bring it all together.”

All this feeds into a continued highlighting of performance, security and reliability gains.

How Mature Is Your Platform?

Platform teams are cost centers, but, until recently, developer productivity wasn’t something that could be easily measured. This means that platform teams have had difficulty assessing their performance and impact. Last month, a new DevEx framework came out that examines developers’ flow state, feedback loops, and cognitive load.

The month before, the Syntasso team open-sourced their Platform Maturity Model which guides teams to answering the following questions:

How does the company value (and therefore fund) platform efforts?
What compels users to start, and be successful, using your platform?
How do users interact with and consume offerings on your platform?
How are requests and requirements identified and prioritized on your platform?
How does product engineering manage non-differentiating (and often internally common) tooling and infrastructure?
How does each business requirement (e.g. compliance or performance) get enabled by platform offerings?

Each of these questions has answers from Levels 1 through 4 to mark maturity of a platform team.

The Realtor.com platform team has created what it refers to as a playbook — an artifact that helps continuously build onto the organization’s Platform-as-a-Product culture. This includes their own maturity model. “It’s recognizing and reminding us that we don’t want to stop at a platform that just works, but we want to be seen for the good and invested in,” Julius said.

Pulling a metaphor for the company’s core industry, she compared a platform to a house. There are parts that you don’t really notice until something goes wrong like a window won’t open or the foundation is cracked. She explained that “Where we strive to mature as a platform when you notice the doors, you notice the windows, and they’re seen for the good.”

Next, the playbook features two decision-making frameworks to decide when to slow down or to speed up. She called them a flywheel to show off how they make decisions collaboratively and cross-functionally, “in a way that we can keep coming back and pointing at that decision as we progress.” They are:

Strategic technical initiative group (STIG) — to ensure technical decisions are made collaboratively and consider the future tech stack and feature development.
Cross-functional workshops — to collaborate and focus on both the Platform-as-a-Product and tech strategy.

Finally, the playbook centers on identity — which Julius said she could’ve given a whole talk around, it’s that essential to the Realtor.com product team. Identity leans into the importance of vision and purpose. A platform team always needs empathy, she argues, putting itself in its stakeholders’ shoes to better understand the technology and onboarding. It’s treating internal customers with the same level of care as external users.

Identity is all about understanding what a success story looks like and working backward to identify key aspects of that story, Julius explained, aligning that story with key decisions and remaining focused on the vision. It’s always about maintaining the organization’s reputation and grounding every decision in context.

“This is all about having the end state in mind, combining the fundamentals with your vision. It’s that compelling story of success.”

The post At PlatformCon: For Realtor.com, Success Is Driven by Stories appeared first on The New Stack.

A New Tool for the Open Source LLM Developer Stack: Aviary

Richard MacManus — Tue, 13 Jun 2023 14:53:00 +0000

The company behind Ray, an open source AI framework that helps power ChatGPT, has just released a new tool to help developers work with large language models (LLMs). Called Aviary, Anyscale describes it as the “first fully free, cloud-based infrastructure designed to help developers choose and deploy the right technologies and approach for their LLM-based applications.” Like Ray, Aviary is being released as an open source project.

I spoke to Anyscale’s Head of Engineering, Waleed Kadous, to discuss the new tool and its impact on LLM applications.

The goal of Aviary is to enable developers to identify the best open source platform to fine-tune and scale an LLM application. Developers can submit test prompts to a pre-selected set of LLMs, including Llama, CarperAI, Dolly 2.0, Vicuna, StabilityAI, and Amazon’s LightGPT.

The Emergence of an Open Source LLM Stack

I told Kadous that there’s an emerging developer ecosystem building up around AI and LLMs; I mentioned LangChain and also Microsoft’s new Copilot stack as examples. I asked how Aviary fits into this new ecosystem.

He replied that we are witnessing the development of an open source LLM stack. He drew a parallel to the LAMP stack of the 1990s and early 2000s (which I also did, in my LangChain post). In the open source LLM stack, he continued, Ray serves as the bottom layer for orchestration and management. Above that, there is an interface for model storage and retrieval — something like Hugging Face. Then there are tools like LangChain “that kind of glues it all together and does all the prompt adjustments.”

Aviary is essentially the back end to run something like LangChain, he explained.

“LangChain is really good for a single query, but it doesn’t really have an off-the-shelf deployment suite,” he said.

Aviary in action.

So why does this LLM stack have to be open source, especially considering the strength of OpenAI and the other big tech companies (like Google) when it comes to LLMs?

Kadous noted the downsides of LLMs owned by companies (such as OpenAI or Google), since their inner workings are often not well understood. They wanted to create a tool that would help access open source LLMs, which are more easily understood. Initially, he said, they intended to just create a comparison tool — which turned out to be the first part of Aviary. But as they worked on the project, he continued, they realized there was a significant gap in the market. There needed to be a way for developers to easily deploy, manage and maintain their chosen open source model. So that became the second half of what Aviary offers.

How a Dev Uses Aviary

Kadous explained that there are two main tasks involved in choosing and then setting up an LLM for an application. The first is comparing different LLM models, which can be done through Aviary’s frontend website, or via the command line.

Aviary currently supports nine different open source LLMs, ranging from small models with 2 billion parameters to larger ones with 30 billion parameters. He said that it took them “a fair amount of effort” to get the comparison engine up to par.

“Each one [LLM] has unique stop tokens [and] you have to kind of tailor the process a little bit,” he said. “In some cases, you can accelerate them using something like DeepSpeed, which is a library that helps to make models run faster.”

One interesting note here is that for the evaluation process, they use OpenAI’s GPT-4 (not an open source LLM!). Kadous said they chose this because it’s currently considered the most advanced model globally. The GPT-4 evaluation provides rankings and comparisons for each prompt, across whichever models were selected.

The second key task for a developer is getting the chosen model into production. The typical workflow involves downloading a model from a repository like Hugging Face. But then additional considerations arise, said Kadous, such as understanding stop tokens, implementing learning tweaks, enabling auto-scaling, and determining the required GPU specifications.

He said that Aviary simplifies the deployment process by allowing users to configure the models through a config file. The aim is to make deployment as simple as running a few command lines, he added.

Ray Serve

Aviary’s main connection with Ray, the distributed computing framework that Anyscale is best known for, is that it uses a library called Ray Serve, which is described as “a scalable model serving library for building online inference APIs.” I asked Kadous to explain how this works.

Ray Serve is specifically designed for serving machine learning models and handling model traffic, he replied. It enables the inference process, where models respond to queries. One of its benefits, he said, is its flexibility and scalability — which allows for easy service deployment and scaling from one instance to multiple instances. He added that Ray Serve incorporates cost-saving features like utilizing spot instances, which he said are significantly cheaper than on-demand instances.

Kadous noted that Ray Serve’s capabilities are particularly important when dealing with large models that require coordination across multiple machines. For example, Falcon LLM has 40 billion parameters, which necessitates running on multiple GPUs. Ray Serve leverages the Ray framework to handle the coordination between those GPUs and manage workloads distributed across multiple machines, which in turn enables Aviary to support these complex models effectively.

Customized Data Requirements

I wanted to know how a developer with a specific use case — say, someone who works for a small insurance company — might use Aviary. Can they upload insurance-related data to Aviary and test it against the models?

Kadous said that developers can engage with Anyscale and request their own customized version of Aviary, which allows them to set up a fine-tuned model. For example, an insurance company might fine-tune a model to generate responses to insurance claims. By comparing the prompts sent to the original model and the fine-tuned model, developers can assess if the fine-tuning has produced the desired differences, or if any unexpected behavior occurs.

Examples of LLM Apps

Finally, I asked Kadous what are the most impressive applications built on top of open LLMs that he’s seen so far.

He mentioned the prevalence of retrieval Q&A applications that utilize embeddings. Embeddings involve converting sentences into sequences of numbers that represent their semantic meaning, he explained. He thinks open source engines have proven to be particularly effective in generating these embeddings and creating semantic similarity.

Additionally, open source models are often used for summarizing the results obtained from retrieval-based applications, he added.

The post A New Tool for the Open Source LLM Developer Stack: Aviary appeared first on The New Stack.

Challenger to x86 RISEs to Solve the Software Problem

Agam Shah — Mon, 12 Jun 2023 18:00:16 +0000

A new chip architecture called RISC-V is emerging as a hot-ticket item to unseat the dominant x86 processor in computer hardware, but it has a problem — it has poor software support.

To solve that problem, some of the biggest names in tech have joined hands to establish RISC-V Software Ecosystem, also called RISE, which is a consortium to promote software development for the free-to-license RISC-V instruction set architecture.

The goal is to create the underlying software tools and middleware that developers will need to write applications for devices or servers with RISC-V chips.

Invoking ARM

The consortium is similar in nature to Linaro, which was formed in 2010 to boost software development for the ARM architecture, which at the time was widespread in smartphones, but was being investigated for PCs and servers. Linaro played a major role in expanding software support to make ARM a viable alternative to x86 for cloud-native applications.

Today, RISC-V is where ARM was in 2010 — an emerging architecture with limited software support, but a legitimate threat to x86 and ARM. Just like Linaro, RISE aims to create software support to boost the adoption of RISC-V chips, which is expected to pick up in the next decade.

The founding companies behind RISE include Google, Intel, Nvidia, Qualcomm, Samsung, and Ventana. These companies are using or have shown an interest in RISC-V architecture.

On the software side, Google has announced Android support for RISC-V, and developed a new OS called KataOS or RISC-V architecture. Linux and middleware distributor Red Hat is also a founding member of RISE.

Intel is diversifying outside of its homegrown x86 architecture and is embracing RISC-V and ARM processors as it prioritizes the manufacturing of chips.

What Is RISC-V?

The RISC-V instruction set architecture has been called the Linux of chips as it is free to use and license. The architecture is much leaner than standard x86 or ARM ISAs but has a Lego-like design approach to constructing chips by putting processing blocks together.

Companies can create custom chips by adding their own silicon blocks on top of the instruction set. As a result, chip designers can create leaner and more power-efficient chips with only the modules they need. Proponents argue that is a better chip design compared to the one-size-fits-all x86 chip approach of Intel and AMD. Chips based on the ARM ISA are customizable, but not to the extent of RISC-V.

About 10 billion RISC-V cores were on the market by the end of 2022, said RISC-V International, which is the organization overseeing the development of the RISC-V standard. That is a big number, but RISC-V chips have not made it to name-brand smartphones, PCs, or servers.

RISC-V has gained popularity in microcontrollers and is replacing an aging population of ARM-based controllers that dominated for decades. For example, Nvidia is using RISC-V controllers in its GPUs, and Apple is also using RISC-V controllers alongside its ARM-based chips in its Macs.

Companies such as SiFive and Alibaba have shipped high-performance development boards and chips, which are being used by enthusiasts and DIY users. The European Commission is also funding numerous research projects to create homegrown, open source RISC-V chips with the goal to cut ties with proprietary chip designs like x86 and ARM.

The Opportunity

The lack of software support is an impediment that is blocking wide adoption of RISC-V, and RISE hopes to solve that problem. A solid software stack will lift RISC-V into the mainstream computing markets.

RISC-V support is already built into Linux, with developers making upstream contributions to the kernel. RISE said it will focus on providing cohesive support for Linux distributions that run on RISC-V systems.

The RISE project will help in the creation of RISC-V binaries and applications through tools and libraries such as the open source GCC compiler and LLVM toolchain.

Options for RISC-V Linux distros are limited. Ubuntu’s Linux distro supports 93% of all packages for RISC-V and is popular among enthusiasts testing out RISC-V boards from SiFive, StarFive, and Allwinner.

RISE will focus on integration with Ubuntu/Debian, Red Hat/Fedora, and Alpine, the consortium said on its site. Developers using Ubuntu can already use interfaces such as KDE and Gnome, and development tools that include C, C++, Python, Java, OpenJDK, Node.js, and Go.

Support for the .Net environment is coming to RISC-V, said Canonical’s Heinrich Schuchardt, in an April presentation.

RISE also focuses on Python, OpenJDK/Java, V8 runtimes, the QEMU emulator, and the ACPI and UEFI system firmware. It is also focused on common system libraries such as OpenSSL and OpenBLAS.

The Problem

Nvidia, which dominates AI software development with its CUDA parallel programming framework, has said it is not looking to bring RISC-V support to its GPUs.

“We like RISC-V because is it open source… but more importantly, it’s adaptable. We can use it for all kinds of interesting configurations of CPUs. However, RISC-V is not appropriate yet and not for some time for external third-party software,” Nvidia CEO Jensen Huang said last year.

Chip makers can add their own silicon modules to the base instruction set. Those modules could be proprietary and not a standard RISC-V module, which companies can then sell as an add-on to customers. Chip makers will have to develop their own drivers, and may not get support from RISE, which will provide the base software tools to support the open source instruction set.

Breaking the X86 Chokehold

The low-hanging fruit for RISC-V is the microcontroller market, and RISE could focus initially on the IoT and embedded markets. But in the long-term goal, RISC-V wants to unseat x86 and ARM in the mobile, PC, and cloud infrastructure.

The embedded market still largely relies on the C and C++ programming languages. The software development may get complicated as it enters the PC and server markets, which will involve more customized RISC-V chips, and more programming tools, containers and virtualization technologies.

Ventana, which is also a member of RISE, has developed a high-performance RISC-V server chip called Veyron V1 that the company claims will offer performance comparable to x86 and ARM-based chips. But for Ventana to get customers, it will also have to sell a comprehensive software stack that is tuned to its custom RISC-V chips, which includes proprietary modules. The chip supports Linux and MySQL and Apache web-serving software, which is similar to the early ARM chips running the LAMP stack.

Hardware makers betting on RISC-V are already bringing OS support. Alibaba has ported Android to RISC-V, and has become an important RISC-V software contributor.

Developers are also making contributions so Firefox can run faster on RISC-V systems.

The chair of RISE is Amber Huffman, principal engineer at Google Cloud. Other notable board members include Daniel Park, who leads open source efforts at Samsung, Lars Bergstom, who is director of engineering at Google, and Mark Skarpness, who runs system software engineering at Intel.

The post Challenger to x86 RISEs to Solve the Software Problem appeared first on The New Stack.

WebAssembly and Go: A Guide to Getting Started (Part 1)

Robert Kimani — Mon, 12 Jun 2023 12:00:36 +0000

WebAssembly (Wasm) and Go are a powerful combination for building efficient and high-performance web applications. WebAssembly is a portable and efficient binary instruction format designed for web browsers, while Go is a programming language known for its simplicity, speed and concurrency features.

In this article, we will explore how WebAssembly and Go can work together to create web applications that leverage the benefits of both technologies. We will demonstrate the steps involved in compiling Go code into Wasm format, loading the resulting WebAssembly module into the browser, and enabling bidirectional communication between Go and JavaScript.

Using Go for WebAssembly offers several advantages. First, Go provides a familiar and straightforward programming environment for web developers, making it easy to transition from traditional Go development to web development.

Secondly, Go’s performance and concurrency features are well-suited for building efficient web applications that can handle heavy workloads.

Finally, the combination of Go and WebAssembly allows for cross-platform compatibility, enabling the deployment of applications on various browsers without the need for plugins or additional dependencies.

We will dive into the technical details of compiling Go code to Wasm, loading the module in a web browser, and establishing seamless communication between Go and JavaScript for WebAssembly.

You’ll come away with a comprehensive understanding of how Wasm and Go can be leveraged together to create efficient, cross-platform web applications. Whether you are a Go developer looking to explore web development or a web developer seeking high-performance options, this article will equip you with the knowledge and tools to get started with WebAssembly and Go.

Go and Its Use Cases

Go is often used for server-side development, network programming and distributed systems, but it can also be used for client-side web development.

Web development. Go is a popular choice for web development due to its simplicity, speed and efficient memory usage. It is well-suited for building backend web servers, APIs and microservices. Go’s standard library includes many built-in packages that make web development easy and efficient. Some popular web frameworks built in Go include Gin, Echo and Revel.

System programming. Go was designed with system programming in mind. It has a low-level feel and provides access to system-level features such as memory management, network programming and low-level file operations. This makes it ideal for building system-level applications such as operating systems, device drivers and network tools.

DevOps tools. Go’s simplicity and efficiency make it well-suited for building DevOps tools such as build systems, deployment tools, and monitoring software. Many popular DevOps tools are built in Go, such as Docker, Kubernetes, and Terraform.

Machine learning. Although not as popular as other programming languages for machine learning, Go’s performance and concurrency features make it a good choice for building machine learning models. It has a growing ecosystem of machine learning libraries and frameworks such as Gorgonia and Tensorflow.

Command-line tools. Go’s simplicity and fast compilation time makes it an ideal choice for building command-line tools. Go’s standard library includes many built-in packages for working with the command-line interface, such as the “flag” package for parsing command-line arguments and the “os/exec” package for executing external commands.

Key Benefits of Using WebAssembly with Go

Performance. WebAssembly is designed to be fast and efficient, which makes it an ideal choice for running computationally intensive tasks in the browser. Go is also known for its speed and efficiency, making it a good fit for building high-performance web applications.

Portability. Wasm is designed to be portable across different platforms and architectures. This means that you can compile Go code into WebAssembly format and run it on any platform that supports WebAssembly. This makes it easier to build web applications that work seamlessly across different devices and operating systems.

Security. WebAssembly provides a sandboxed environment for running code in the browser, which helps to prevent malicious code from accessing sensitive user data. Go also has built-in security features such as memory safety and type safety, which can help to prevent common security vulnerabilities.

Concurrency. Go is designed with concurrency in mind, which makes it easier to build web applications that can handle multiple requests simultaneously. By combining WebAssembly and Go, you can build web applications that are highly concurrent and can handle a large number of requests at the same time.

How WebAssembly Works with the Browser

When a Wasm module is loaded in a browser, it is executed by a virtual machine called the WebAssembly Runtime, which translates the Wasm code into machine code that the browser’s JavaScript engine can execute.

The WebAssembly Runtime is implemented in the browser as a JavaScript library and provides a set of APIs for loading, validating and executing Wasm modules. When a Wasm module is loaded, the Runtime validates the module’s bytecode and creates an instance of the module, which can be used to call its functions and access its data.

Wasm modules can interact with the browser’s Document Object Model (DOM) and other web APIs using JavaScript. For example, a Wasm module can modify the contents of a webpage, listen for user events, and make network requests using the browser’s web APIs.

One of the key benefits of using Wasm with the browser is that it provides a way to run code that is more performant than JavaScript. JavaScript is an interpreted language, which means that it can be slower than compiled languages like C++ or Go. However, by compiling code into Wasm format, it can be executed at near-native speeds, making it ideal for computationally intensive tasks such as machine learning or 3D graphics rendering.

Using WebAssembly with Go

The Go programming language has a compiler that can produce Wasm binaries, allowing Go programs to run in a web browser. The Go compiler for WebAssembly, called wasm, can be invoked using the GOARCH=wasm environment variable.

When compiling a Go program for WebAssembly, the Go compiler generates WebAssembly bytecode that can be executed in the browser using the WebAssembly Runtime. The generated Wasm module includes all of the Go runtime components needed to run the program, so no additional runtime support is required in the browser.

The Go compiler for WebAssembly supports the same set of language features as the regular Go compiler, including concurrency, garbage collection, and type safety. However, some Go features are not yet fully supported in WebAssembly, such as reflection and cgo.

Reflection. Reflection is a powerful feature in Go that allows programs to examine and manipulate their own types and values at runtime. However, due to the limitations of the Wasm runtime environment, reflection is not fully supported in Go programs compiled to WebAssembly. Some reflection capabilities may be limited or unavailable in WebAssembly binaries.

Cgo. The cgo tool in Go enables seamless integration with C code, allowing Go programs to call C functions and use C libraries. However, the cgo functionality is not currently supported in Go programs compiled to WebAssembly. This means that you cannot directly use cgo to interface with C code from WebAssembly binaries.

Technical Overview: How Wasm and Go Work Together

To compile Go code into WebAssembly format, you can use the Golang Wasm compiler. This tool generates a .wasm file that can be executed in a web browser. The compiler translates Go code into WebAssembly instructions that can be executed by a virtual machine in the browser.

Once you have the .wasm file, you need to load it into the browser using the WebAssembly JavaScript API. This API provides functions to load the module, instantiate it, and execute its functions.

You can load the .wasm file using the fetch() function, which loads the file as an ArrayBuffer. You can then instantiate the module using the WebAssembly.instantiate() function, which returns a Promise that resolves to a WebAssembly.Module object.

Calling Go Functions from JavaScript

After the WebAssembly module is loaded and instantiated, it exposes its functions to JavaScript. These functions can be called from JavaScript using the WebAssembly JavaScript API.

You can use the WebAssembly.instantiate() function to obtain a JavaScript object that contains the exported functions from the WebAssembly module. You can then call these functions from JavaScript just like any other JavaScript function.

Calling JavaScript Functions from Go

To call JavaScript functions from Go, you can use the syscall/js package. This package provides a way to interact with the JavaScript environment. You can create JavaScript values, call JavaScript functions, and handle JavaScript events from Go.

Use the js.Global() function to get the global object in the JavaScript environment. You can then call any function on this object using the Call() function, passing in the function name and any arguments.

The Golang WebAssembly API

The Golang WebAssembly API provides a set of functions that can be used to interact with WebAssembly modules from Go code running in a web browser. These functions allow Go programs to call functions defined in WebAssembly modules, pass values between Go and WebAssembly, and manipulate WebAssembly memory.

The Golang WebAssembly API is implemented as a set of Go packages, including “syscall/js,” which provides a bridge between Go and JavaScript, and “syscall/js/wasm,” which provides a bridge between Go and WebAssembly.

Using the Golang WebAssembly API, Go programs can load and instantiate Wasm modules, call functions defined in the modules, and manipulate the memory of the modules. For example, a Go program can load a Wasm module that performs complex computations, and then use the Golang WebAssembly API to call functions in the module and retrieve the results.

The Golang WebAssembly API also provides a way to define and export Go functions that can be called from WebAssembly modules. This allows Go programs to expose functionality to WebAssembly modules and provides a way to integrate Go code with existing JavaScript codebases.

Here’s a demonstration of how to compile a simple Go program to WebAssembly and load it in the browser

First, we need to install the Go compiler for WebAssembly. This can be done by running the following command:

GOARCH=wasm GOOS=js go get -u github.com/golang/go1.16.4

This will install the WebAssembly-enabled version of the Go compiler.

Next, we can write a simple Go program that adds two numbers together:

package main

import "fmt"

func add(a, b int) int {

  return a + b
}


func main() {
  fmt.Println("Hello from Go!")
}

We can then compile this program to WebAssembly by running the following command:

GOARCH=wasm GOOS=js go build -o add.wasm

This will generate a WebAssembly binary file called “add.wasm.”

Now we can write some JavaScript code to load and execute the WebAssembly module. Here’s an example

const go = new Go();

WebAssembly.instantiateStreaming(fetch('add.wasm'), 
go.importObject).then((result) => {
  go.run(result.instance);
  console.log("Result:", add(2, 3)); // call the 'add' function defined in the Go program
});

This code creates a new instance of the Go WebAssembly API, loads the add.wasm module using the WebAssembly API, runs the module, and then calls the add function defined in the Go program.

Finally, we can load our JavaScript code in a webpage and view the output in the browser console. For example:



 
   
   Go + WebAssembly Example

This HTML file loads the wasm_exec.js file, which is included with the Go compiler for WebAssembly, and then includes our JavaScript code to load and execute the add.wasm module.

That’s it! With these steps, we can compile a simple Go program to WebAssembly and load it in a web browser using JavaScript. This provides a powerful way to build high-performance web applications with the simplicity and ease of use of the Go programming language.

How to Use Go with Various Wasm Frameworks

Here’s an overview of different WebAssembly frameworks that can be used with Go, including AssemblyScript (a TypeScript-like language that compiles to Wasm) and TinyGo (a variant of Go that compiles to WebAssembly and other embedded systems).

AssemblyScript

AssemblyScript provides a familiar syntax for web developers and can be used alongside Go to provide additional functionality to a web application. Here’s an example of how to use Go with AssemblyScript:

import * as go from "go";

const wasmModule = new WebAssembly.Module(await f
etch('add.wasm').then(response => response.arrayBuffer()));
const wasmInstance = new WebAssembly.Instance(wasmModule, go.importObject)

console.log(wasmInstance.exports.add(2, 3)); // Call the 'add' function defined in the Wasm module

await go.run(wasmInstance); // Start the Go runtime and call Go functions from JavaScript

In this example, we load the add.wasm module using the WebAssembly API and instantiate it with the Go import object. We then call the add function defined in the WebAssembly module and pass it two parameters. Finally, we start the Go runtime and call Go functions from JavaScript.

TinyGo

TinyGo provides a subset of the Go standard library and can be used to write low-level code that runs in the browser. Here’s an example of how to use TinyGo to call a function defined in a Go WebAssembly module:

package main

import "syscall/js"

func add(this js.Value, inputs []js.Value) interface{} {
  a := inputs[0].Int()
  b := inputs[1].Int()
  return a + b
}

func main() {
  c := make(chan struct{}, 0)
  js.Global().Set("add", js.FuncOf(add))
  <-c
}

In this example, we define a function called add that takes two integer parameters and returns their sum. We then use the “syscall/js” package to export this function to JavaScript. Finally, we block the main thread using a channel to prevent the Go program from exiting.

We can then call this function from JavaScript using the following code:

const wasmModule = await WebAssembly.instantiateStreaming(fetch('add.wasm'), 
go.importObject);
const go = new Go();


WebAssembly.instantiateStreaming(fetch('add.wasm'), 
go.importObject).then((result) => {
   go.run(result.instance);
   console.log("Result:", add(2, 3)); // call the 'add' function defined in the Go program
});

In this example, we instantiate the WebAssembly module and pass it to the Go runtime using the Go import object. We then run the Go program and call the add function defined in the Go program. The result is then printed to the console.

Using Wasm for Cross-Platform Development

WebAssembly code can be run in any environment that supports it, including browsers and standalone runtimes. Developers can use it to create applications that can run on multiple platforms with minimal code changes — fulfilling WebAssembly’s promise of “build once, run anywhere.” This can help to reduce development time and costs, while also providing a consistent user experience across different devices and platforms.

One way to use Wasm for cross-platform development is to build an application in a language that can be compiled to WebAssembly, such as Go or Rust. Once the application is built, it can be compiled to WebAssembly and deployed to the web, or compiled to native code and deployed to a desktop environment, using a framework like Electron or GTK.

Another way to use Wasm for cross-platform development is to build an application in a web-based language like JavaScript, and then compile it to WebAssembly using a tool like Emscripten. This approach can be especially useful for porting existing web applications to run on desktop environments, or for building applications that need to run on both the web and desktop.

Go programs can be compiled to both WebAssembly and native desktop environments using a number of different tools and frameworks.

For example, Electron is a popular framework for building cross-platform desktop applications using web technologies like HTML, CSS, and JavaScript. Go programs can be compiled to run on Electron using a tool like Go-Electron, which provides a way to package Go applications as Electron apps.

Another option is to use GTK, a popular cross-platform toolkit for building desktop applications. Go programs can be compiled to run on GTK using the gotk3 package, which provides Go bindings for GTK.

The post WebAssembly and Go: A Guide to Getting Started (Part 1) appeared first on The New Stack.

WebAssembly and Go: A Guide to Getting Started (Part 2)

Robert Kimani — Mon, 12 Jun 2023 12:00:13 +0000

WebAssembly (Wasm) and Golang (Go) are a dynamic duo for high-performance web applications due to their specific features and advantages. Wasm is a binary instruction format that allows running code at near-native speed in modern web browsers. It provides a low-level virtual machine that enables efficient execution of code, making it ideal for performance-intensive tasks.

Go is a statically typed, compiled programming language known for its simplicity, efficiency and high-performance characteristics. It offers built-in concurrency support, efficient memory management, and excellent execution speed. These qualities make Go a suitable language for developing backend systems that power web applications.

By combining WebAssembly and Go, developers can achieve exceptional performance in web applications. Go can be used to write backend services, APIs and business logic, while WebAssembly can be used to execute performance-critical code in the browser. This combination allows for offloading computation to the client-side, reducing server load and improving responsiveness.

Furthermore, Go has excellent interoperability with WebAssembly, allowing seamless integration between the two. Developers can compile Go code to WebAssembly modules, which can be executed in the browser alongside JavaScript, enabling the utilization of Go’s performance benefits on the client side.

Performance is of paramount importance in web applications for several reasons:

User experience. A fast and responsive web application enhances the user experience and satisfaction. Users expect web pages to load quickly and respond promptly to their interactions. Slow and sluggish applications can lead to frustration, abandonment and loss of users.

Conversion rates. Performance directly impacts conversion rates, especially in e-commerce and online businesses. Even minor delays in page load times can result in higher bounce rates and lower conversion rates, studies have shown. Improved performance can lead to increased engagement, longer session durations and higher conversion rates.

Search Engine Optimization (SEO). Search engines, like Google, take website performance into account when ranking search results. Faster-loading websites tend to have better search engine rankings, which can significantly impact organic traffic and visibility.

Mobile users. With the increasing use of mobile devices, performance becomes even more critical. Mobile networks can be slower and less reliable than fixed-line connections. Optimizing web application performance ensures a smooth experience for mobile users, leading to better engagement and retention.

Competitiveness. In today’s highly competitive digital landscape, performance can be a key differentiator. Users have numerous options available, and if your application is slow, they may switch to a competitor offering a faster and more efficient experience.

How Wasm Enhances Web Application Performance

Near-native performance. WebAssembly is designed to execute code at near-native speed. It achieves this by using a compact binary format that can be efficiently decoded and executed by modern web browsers. Unlike traditional web technologies like JavaScript, which are interpreted at runtime, Wasm code is compiled ahead of time and can be executed directly by the browser’s virtual machine, resulting in faster execution times.

Efficient execution. WebAssembly provides a low-level virtual machine that allows for efficient execution of code. It uses a stack-based architecture that minimizes the overhead associated with memory access and function calls. Additionally, WebAssembly operates on a compact binary format, reducing the size of the transmitted code and improving load times.

Multilanguage support. WebAssembly is designed to be language-agnostic, which means it can be used with a wide range of programming languages. This allows developers to leverage the performance benefits of Wasm while using their preferred programming language. By compiling code from languages like C, C++, Rust, and Go to WebAssembly, developers can take advantage of their performance characteristics and seamlessly integrate them into web applications.

Offloading computation. Wasm enables offloading computationally intensive tasks from the server to the client side. By moving certain operations to the browser, web applications can reduce the load on the server, distribute computation across multiple devices and improve overall responsiveness. This can be particularly beneficial for applications that involve complex calculations, image processing, simulations and other performance-critical tasks.

Seamless integration with JavaScript. WebAssembly can easily integrate with JavaScript, the traditional language of the web. This allows developers to combine the performance benefits of Wasm with the rich ecosystem of JavaScript libraries and frameworks. WebAssembly modules can be imported and exported from JavaScript code, enabling interoperability and smooth interaction between the two.

Progressive enhancement. Wasm supports a progressive enhancement approach to web development. Developers can choose to compile performance-critical parts of their application to WebAssembly while keeping the rest of the code in JavaScript. This way, the performance gains are selectively applied where they are most needed, without requiring a complete rewrite of the entire application.

WebAssembly vs. Other Web Technologies

WebAssembly outperforms JavaScript and asm.js in terms of execution speed. JavaScript is an interpreted language, while asm.js is a subset of JavaScript optimized for performance.

In contrast, WebAssembly executes at near-native speed, thanks to its efficient binary format and ahead-of-time (AOT) compilation. Wasm is language-agnostic, allowing developers to use multiple languages.

JavaScript has a larger developer community and mature tooling, while asm.js requires specific optimizations. WebAssembly binaries are smaller, resulting in faster load times. JavaScript has wider browser compatibility and seamless interoperability with web technologies.

WebAssembly requires explicit interfaces for interaction with JavaScript. Overall, Wasm offers high performance, while JavaScript has wider adoption and tooling support. Usage of asm.js has diminished with the rise of WebAssembly. The choice depends on performance needs, language preferences and browser support.

How Go Helps Create High-Performance Apps

Go is known for its key features that contribute to building high-performance applications. These features include:

Compiled language. Go compiles source code into efficient machine code, which results in fast execution and eliminates the need for interpretation at runtime. The compiled binaries can be directly executed by the operating system, providing excellent performance.

Concurrency support. The language has built-in support for concurrency through goroutines and channels. Goroutines are lightweight threads that allow concurrent execution of functions, while channels facilitate communication and synchronization between goroutines.

This concurrency model makes it easy to write highly concurrent and parallel programs, enabling efficient use of available resources and improving performance in scenarios like handling multiple requests or processing large amounts of data concurrently.

Efficient garbage collection, Go incorporates a garbage collector that automatically manages memory allocation and deallocation. It uses a concurrent garbage collector that minimizes pauses and allows applications to run smoothly without significant interruptions. The garbage collector efficiently reclaims unused memory, preventing memory leaks and enabling efficient memory management in high-performance applications.

Strong standard library. Go comes with a rich standard library that provides a wide range of functionalities, including networking, file I/O, encryption, concurrency primitives and more. The standard library is designed with performance and efficiency in mind, offering optimized implementations and well-designed APIs.

Developers can leverage these libraries to build high-performance applications without relying heavily on third-party dependencies.

Native support for concurrency patterns. Go provides native support for common concurrency patterns, such as mutexes, condition variables and atomic operations. These features enable developers to write thread-safe and efficient concurrent code without the complexities typically associated with low-level synchronization primitives.

This native support simplifies the development of concurrent applications and contributes to improved performance.

Efficient networking. Golang’s standard library includes a powerful networking package that offers efficient abstractions for building networked applications. It provides a robust set of tools for handling TCP/IP, UDP, HTTP, and other protocols. The networking capabilities of Go are designed to be performant, enabling the development of high-throughput and low-latency network applications.

Compilation to standalone binaries. Go can compile code into standalone binaries that contain all the necessary dependencies and libraries. These binaries can be easily deployed and executed on various platforms without requiring the installation of additional dependencies.

This approach simplifies deployment and can contribute to better performance by reducing overhead and ensuring consistent execution environments.

Using Wasm for Computationally Intensive Tasks

Wasm can greatly improve the performance of computationally intensive tasks like image processing or cryptography by leveraging its near-native execution speed. By compiling algorithms or libraries written in languages like C/C++ or Rust to WebAssembly, developers can achieve significant performance gains.

WebAssembly’s efficient binary format and ability to execute in a sandboxed environment make it ideal for running computationally intensive operations in the browser.

Go programs can benefit from improved performance when compiled to Wasm for computationally intensive tasks. For example, Go libraries or applications that involve heavy image manipulation, complex mathematical calculations or cryptographic operations can be compiled to WebAssembly to take advantage of its speed.

Using WebAssembly for UI Rendering

WebAssembly can improve UI rendering performance in the browser compared to traditional JavaScript approaches. By leveraging Wasm’s efficient execution and direct access to low-level operations, rendering engines can achieve faster updates and smoother animations.

WebAssembly allows UI rendering code to run closer to native speeds, resulting in improved user experiences, especially for complex or graphically intensive applications.

UI frameworks or libraries like React or Vue.js can benefit from improved performance when compiled to WebAssembly. By leveraging the speed and efficiency of Wasm, these frameworks can deliver faster rendering and more responsive user interfaces. Compiling UI components written in languages like Rust or C++ to WebAssembly can enhance the overall performance and responsiveness of the UI, making the user experience more seamless and interactive.

Using WebAssembly for Game Development

WebAssembly’s efficient execution and direct access to hardware resources make it ideal for browser-based game development. It offers improved performance compared to traditional JavaScript game engines. By compiling game logic and rendering code to WebAssembly, developers can achieve near-native speeds, enabling complex and visually rich games to run smoothly in the browser.

Go-based game engines like Azul3D can benefit from improved performance when compiled to WebAssembly. By leveraging the speed and efficiency of Wasm, Go game engines can deliver high-performance browser games with advanced graphics and physics simulations.

Compiling Go-based game engines to WebAssembly enables developers to harness Go’s performance characteristics and create immersive gaming experiences that rival native applications.

The Power of Go and WebAssembly: Case Studies

TinyGo

TinyGo is a project that compiles Go code to WebAssembly for running on resource-constrained devices and in the browser. It showcases the performance gains of combining Go with Wasm for scenarios where efficiency and low resource usage are crucial.

Wasmer

Wasmer is an open-source runtime for executing WebAssembly outside the browser. It supports running Go code as WebAssembly modules. Wasmer’s performance benchmarks have demonstrated that Go code executed as Wasm can achieve comparable or better performance than JavaScript in various scenarios.

Vecty

Vecty is a web framework for building responsive and dynamic frontends in Go using WebAssembly. It aims to compete with modern web frameworks like React and VueJS. Here are some key features of Vecty:

Simplicity. Vecty is designed to be easily mastered by newcomers, especially those familiar with the Go programming language. It follows Go’s philosophy of simplicity and readability.
Performance. Vecty focuses on providing efficient and understandable performance. It aims to generate small bundle sizes, resulting in faster loading times for your web applications. Vecty strives to achieve the same performance as raw JavaScript, HTML and CSS.
Composability. Vecty allows you to nest components, enabling you to build complex user interfaces by logically separating them into smaller, reusable packages. This composability promotes code reusability and maintainability.
Designed for Go. Vecty is specifically designed for Go developers. Instead of translating popular libraries from other languages to Go, Vecty was built from the ground up, asking the question, “What is the best way to solve this problem in Go?” This approach ensures that Vecty leverages Go’s unique strengths and idioms.

Best Practices: Developing Web Apps with Wasm and Go

Optimize Go Code for WebAssembly

Minimize memory allocations. Excessive memory allocations can impact performance. Consider using object pooling or reusing memory to reduce the frequency of allocations and deallocations.

Use efficient data structures. Choose data structures that are optimized for performance. Go provides various built-in data structures like slices and maps that are efficient for most use cases.

Limit garbage collection pressure. Excessive garbage collection can introduce pauses and affect performance. Minimize unnecessary object allocations and use the appropriate garbage collection settings to optimize memory management.

Optimize loops and iterations. Identify loops and iterations that can be optimized. Use loop unrolling, minimize unnecessary calculations and ensure efficient memory access patterns.

Leverage goroutines and channels. Go’s concurrency primitives, goroutines, and channels, can help maximize performance. Use them to parallelize tasks and efficiently handle concurrent operations.

Maximize Performance in Wasm Modules

Minimize startup overhead. Reduce the size of the WebAssembly module by eliminating unnecessary code and dependencies. Minify and compress the module to minimize download time.

Optimize data transfers. Minimize data transfers between JavaScript and Wasm modules. Use efficient memory layouts and data representations to reduce serialization and deserialization overhead.

Use SIMD instructions. If applicable, use single instruction, multiple data (SIMD) instructions to perform parallel computations and improve performance. SIMD can be especially beneficial for tasks involving vector operations.

Profile and optimize performance-critical code. Identify performance bottlenecks by profiling the WebAssembly module. Optimize the hot paths, critical functions and sections that consume significant resources to improve overall performance.

Use compiler and optimization flags. Use compiler-specific flags and optimizations tailored for WebAssembly. Different compilers may have specific optimizations to improve performance for Wasm targets.

Minimize Latency and Improve Responsiveness

Reduce round trips. Minimize the number of network requests by combining resources, utilizing caching mechanisms, and employing efficient data transfer protocols like HTTP/2 or WebSockets.

Do asynchronous operations. Use asynchronous programming techniques to avoid blocking the main thread and enhance responsiveness. Employ callbacks, Promises, or async/await syntax for non-blocking I/O operations.

Employ lazy loading and code splitting. Divide the application into smaller modules and load them on-demand as needed. Lazy loading and code splitting reduce the initial load time and improve perceived performance.

Use efficient DOM manipulation. Optimize Document Object Model (DOM) manipulation operations by batching changes and reducing layout recalculations. Use techniques like virtual DOM diffing to minimize updates and optimize rendering.

Rely on caching and prefetching. Leverage browser caching mechanisms and prefetching to proactively load resources that are likely to be needed, reducing latency and improving perceived performance.

The post WebAssembly and Go: A Guide to Getting Started (Part 2) appeared first on The New Stack.

Dev News: Apollo Drama, Monster API and Mobile App Discontent

Loraine Lawson — Sat, 10 Jun 2023 13:00:04 +0000

Condolences to Apollo, the popular Reddit app, which will be shutting down June 30. The gritty details were posted on Reddit, of course, but essentially app developer Christian Selig blamed Reddit’s API price increase. Selling said it was a 20x price increase, amounting to approximately $2.50 per month per user. At that price with the apps current usage, it would cost almost $2 million per month or over $20 million per year, Selig claimed.

Apparently, things got ugly between Reddit and Selig, who offers a link to an audio of a call with Reddit and links to a screenshot of a Mastodon poster accusing him of attempting to blackmail Reddit. It seems Reddit also accused the site of scrapping, leading to Selig to post the app’s backend code on Github. The price increase also led to other turmoil for Reddit, including news of a subreddit strike planned for Monday.

Monster API Platform to Simplify AI

A new company called Monster API launched its platform this week. It’s designed to give developers access to graphics processing units (GPUs) infrastructure and pre-trained artificial intelligence models at a lower cost than other cloud-based options, according to the press statement.

Monster API uses decentralized computing to allow developers to create generative AI applications. The new platform allows developers to access AI models such as Stable Diffusion, Whisper AI and StableLM “out-of-the-box.”

Monster API’s full stack includes an optimization layer, a compute orchestrator, a massive GPU infrastructure, and ready-to-use inference APIs. It also supports fine tuning large language models such as LLaMA and StableLM.

“We eliminate the need to worry about GPU infrastructure, containerization, setting up a Kubernetes cluster, and managing scalable API deployments as well as offering the benefits of lower costs,” said Sarah Vin, CEO and co-founder of the company. “One early customer has saved over $300,000 by shifting their ML workloads from AWS to Monster API’s distributed GPU infrastructure.”

Monster API is the collaboration of two brothers, Saurabh Vij and Gaurav Vij. Gaurav, who faced a significant challenge for his startup when his AWS bill skyrocketed, according to the company statement. In parallel, Saurabh, formerly a particle physicist at CERN (European Council for Nuclear Research), recognized the potential of distributed computing in projects. Inspired by these experiences, the brothers sought to harness the computing power of consumer devices like PlayStations, gaming PCs, and crypto mining rigs for training ML models.

After multiple iterations, they successfully optimized GPUs for ML workloads, leading to a 90% reduction in Gaurav’s monthly bills.

The company promises a predictable API bill versus the current pay by GPU time. Its APIs also scale automatically to handle increased demand, from one to 100 GPUs. The company also announced $1.1 million in pre-seed funding this week.

The Mobile Release of Our Discontent

A majority of companies “are not happy” with how often they release new versions of their mobile apps, according to a survey of 1,600 companies conducted by Bitrise, a mobile DevOps platform. Sixty-two percent of teams said that their release frequency is “unsatisfactory.”

The survey found React Native is the most popular cross-platform framework, used by 48.33% of teams, followed by Flutter at 37.5%. When it comes to testing, only 10.4% of teams said they test as many devices as possible with a device farm, and 31% reported they test the most commonly used devices in their user base.

It also found that 25.7% of teams don’t have the features and functionality of their iOS and Android apps in sync.

Bitrise is also proposing a benchmark for the mobile app market similar to Google’s DORA metrics, which it calls MODAS: the Mobile DevOps Assessment. MODAS uses five key performance metrics for apps:

Creation
Testing
Deployment
Monitoring
Collaboration

The study also links to a number of online case studies about mobile speed, noting that when it comes to “mobile app iterations for example, speed is everything: there is a strong correlation between the frequency of updates and the ranking in the app stores.”

The post Dev News: Apollo Drama, Monster API and Mobile App Discontent appeared first on The New Stack.

Vector Databases: What Devs Need to Know about How They Work

David Eastman — Sat, 10 Jun 2023 12:00:27 +0000

When we say “database” today we are probably talking about persistent storage, relational tables, and SQL. Rows and Columns, and all that stuff. Many of the concepts were designed to pack data into what was, at the time they were created, limited hard disk space. But most of the things we store and search for are still just numbers or strings. And while dealing with strings is clearly a little more complex than dealing with numbers, we generally only need an exact match — or maybe a simply defined fuzzy pattern.

This post looks at the slightly different challenges to traditional tooling that AI brings. The journey starts with a previous attempt to emulate modern AI, by creating a Shakespeare sonnet.

We analyzed a corpus and tried predicting words, a trick played to perfection by ChatGPT. We recorded the distance words appeared from each other. And we used this distance data to guess similar words based on their distances to the same word.

So in the above, if we were to have only two phrases in our corpus, then the word following “Beware” could be “the” or “of”. But why couldn’t we produce ChatGPT-level sonnets? My process was just the equivalent of a couple of dimensions of training data. There was no full model as such, and no neural network.

What we did was a somewhat limited attempt to turn words into something numerical, and thus computable. This is largely what a word embedding is. Either way, we end up with a set of numbers — aka a vector.

At school we remember vectors having magnitude and direction, so they could be used to plot an airplane’s course and speed, for example. But a vector can have any amount of numbers or dimensions attached to it:

x=(x₁, x₂, x₃, … ,x₉)

Obviously, this can no longer be placed neatly in physical space, though I welcome any n-dimensional beings who happen to be reading this post.

By reading lots of texts and comparing words, vectors can be created that will approximate characteristics like the semantic relationship of the word, definitions, context, etc. For example, reading fantasy literature I might see very similar uses of “King” and “Queen”:

Word2Vec explained

The values here are arbitrary of course. But we can start to think about doing vector maths, and understand how we can navigate with these vectors:

King - Man + Woman = Queen

[5,3] - [2,1] + [3, 2] = [6,4]

The trick is to imagine not just two, but a vector of many, many dimensions. The Word2Vec algorithm uses a neural network model to learn word associations like this with a large corpus of text. Once trained, such a model can detect similar words:

Given a large enough dataset, Word2Vec can make strong estimates about a word’s meaning based on their occurrences in the text.

Using neural network training methods, we can start to both produce more vectors and improve our model’s ability to predict the next word. The network translates the “lessons” provided by the corpus into a layer within vector space that reliably “predicts” similar examples. You can train on what target word is missing in a set of words, or you can train on what words are around a target word.

The common use of Shakespeare shouldn’t be seen as some form of elite validation of the Bard’s ownership of language. It is just a very large set of accurately recorded words that we all recognize as consistent English and within the context of one man’s endeavors. This matters, because whenever he says “King” or “Queen” he retains the same perspective. If he was suddenly talking about chess pieces, then the context for “Knight” would be quite different — however valid.

Any large set of data can be used to extract meaning. For example, we can look at tweets about the latest film “Spider-Man: Across the Spiderverse,” which has generally been well-reviewed, by those who would be likely to comment or see it:

“That was a beautiful movie.”

“The best animation ever, I’m sorry but it’s true, only 2 or 3 movies are equal to this work of art.”

“It really was peak.”

“..is a film made with LOVE. Every scene, every frame was made with LOVE.”

“We love this film with all our hearts.”

But you can begin to see that millennial mannerisms mixed with Gen Z expressions, while all valid, might cause some problems. The corpus needs to be sufficiently large that there would be natural comparisons within the data, so that one type of voice didn’t become an outlier.

Obviously, if you wanted to train a movie comparison site, these are the embeddings you would want to look at.

Ok, so we now have an idea of what word embeddings are in terms of vectors. Let’s generalize to vector embeddings, and imagine using sentences instead of single words, or pixel values to construct images. As long as we can convert from data items to vectors, the same methods apply.

In summary:

Models help generate vector embeddings.
Neural networks train these models.

What a Vector Database Does

Unsurprisingly, a vector database deals with vector embeddings. We can already perceive that dealing with vectors is not going to be the same as just dealing with scalar quantities (i.e. just normal numbers that express a value or magnitude).

The queries we deal with in traditional relational tables normally match values in a given row exactly. A vector database interrogates the same space as the model which generated the embeddings. The aim is usually to find similar vectors. So initially, we add the generated vector embeddings into the database.

As the results are not exact matches, there is a natural trade-off between accuracy and speed. And this is where the individual vendors make their pitch. Like traditional databases, there is also some work to be done on indexing vectors for efficiency, and post-processing to impose an order on results.

Indexing is a way to improve efficiency as well as to focus on properties that are relevant in the search, paring down large vectors. Trying to accurately represent something big with a much smaller key is a common strategy in computing; we saw this when looking at hashing.

Working out the meaning of “similar” is clearly an issue when dealing with a bunch of numbers that stand in for something else. Algorithms for this are referred to as similarity measures. Even in a simple vector, like for an airplane, you have to decide whether two planes heading in the same direction but some distance away are more or less similar to two planes close to each other but with different destinations.

Learning from Tradition

The final consideration is leveraging experience from traditional databases — there are plenty of them to learn from. So for fault tolerance, vector databases can use replication or sharding, and face the same issues between strong and eventual consistency.

Common sense suggests that there will be strategic combinations of traditional vendors and niche players, so that these methods can be reliably applied to the new data that the AI explosion will be producing. So a vector database is yet another of the new and strange beasts that should become more familiar as AI continues to be exploited.

The post Vector Databases: What Devs Need to Know about How They Work appeared first on The New Stack.

Winglang: Cloud Development Programming for the AI Era

Shai Ber — Fri, 09 Jun 2023 17:00:22 +0000

As long as AI serves as a co-pilot rather than an auto-pilot, there’s room for a language that facilitates effective collaboration between humans and AI. This can be achieved by reducing cognitive load and enabling rapid testing, significantly cutting iteration times. Moreover, AI simplifies the adoption of new languages.

So, why invest time and effort in the development of a new programming language (for humans) today when AI is rapidly advancing and taking over more coding tasks?

I often encounter this question in various forms:

Won’t AI eventually write machine code directly, rendering programming languages obsolete?
Can a new language introduce features or capabilities that AI cannot achieve using existing languages? (For example, why create a cloud-portable language when AI can write code for a specific cloud and then rewrite it for another?)
Is it worthwhile to create tools for developers who might soon be replaced by AI?

First, I must admit that I cannot predict the pace of AI advancement. Reputable experts hold differing opinions on when, or if, AI will replace human developers.

However, even if AI does eventually replace human developers, it may not necessarily write machine code directly. Why would an AI choose to reinvent the wheel for each app by writing machine code directly when it can rely on proven abstraction layers and compilers, allowing it to efficiently focus on the unique aspects of the business it serves? By building on existing work and focusing on smaller, simpler tasks, the AI can yield faster, higher-quality results.

Having covered the more distant future, I now want to focus on the more immediate future for the remainder of this post.

I believe that, given human limitations and psychology, change will likely be gradual despite AI’s rapid progress, leading to a significant transitional period with humans remaining in the loop. For instance, it’s hard to imagine organizations not desiring a human to be accountable for the AI’s output. That human would be very reluctant to let the AI do its work in a way that the human cannot understand, modify and maintain.

Think about it, would you let ChatGPT write a professional article for your peers, in your name, in a language you don’t speak? Would you publish it without being able to read it? Probably not. Similarly, would an engineering manager release a mission-critical app to production knowing that it was written by AI in a way that would make it hard for humans to step in if something goes wrong?

Additionally, while it is true that AI is an equalizer between tools to some degree, it still doesn’t completely solve the problem. Let’s take the cloud portability example from above: Even if the AI can port my code between clouds, I still want to be able to read and modify it. As a result, I must become an expert in all these clouds at the level of abstraction the AI used. If a new language allows it to write at a higher level of abstraction, it will be easier for me to understand and modify it too.

Assuming AI empowers us to rapidly generate vast amounts of code, the bottleneck inevitably shifts to the testing and validation phase. This occurs not solely due to AI’s inherent limitations, but primarily because of our own imperfections as humans. We are incapable of flawlessly articulating our requirements, which necessitates experiencing a working version of the end product, interacting with it and determining whether it fulfills our needs or if we’ve overlooked any edge cases. This iterative process continues until our creation reaches perfection.

In a landscape where testing and validation consume the majority of software delivery time, there is ample opportunity for tools that significantly streamline this phase. By reducing the time required to deploy and evaluate an application within a development environment, these tools can greatly enhance overall efficiency.

Therefore, I believe that for the foreseeable future, there is room for tools that make it easier for both humans and AI to write quality code swiftly, collaborate effectively and test more rapidly. Such tools will allow us to enhance the quality and speed of our application delivery.

The Key: Reducing Cognitive Load and Accelerating Iteration

Whether you’re an AI or a human developer, reducing complexity and iterating faster will result in better applications developed more quickly.

So, what can be done to make these improvements?

Working at a Higher Level of Abstraction

Utilizing a higher level of abstraction offers the following benefits for both human and AI coders:

Reduces cognitive load for human developers by focusing on the app’s business logic instead of implementation details. This enables developers to concentrate on a smaller problem (e.g., instructing a car to turn right rather than teaching it how to do so), deal with fewer levels of the stack, write less code and minimize the surface area for errors.
Reduces cognitive load for AI. This concept may need further clarification. AI systems come pretrained with knowledge of all levels of the stack, so knowing less is not a significant advantage. Focusing on a smaller problem is also not as beneficial as it would be for a human, because as long as the AI knows how to instruct the car to turn, it shouldn’t have an issue teaching it how to do so instead of just telling it to turn. But it’s still advantageous, as explained above, since it reduces the problem surface, allowing the AI to generate the code faster and at a higher quality. However, allowing the AI to write less code and reducing the chance for it to make mistakes is highly beneficial, as AI is far from infallible. Anyone who has witnessed it hallucinate interfaces or generate disconnected code can attest to this. Furthermore, AI is constrained by the amount of code it can generate before losing context. So writing less code enables AI coders to create larger and more complex parts of applications.
Accelerates iteration speed because it requires writing less code, reducing the time it takes to write and maintain it. While it might not seem intuitive, this is equally important for both human and AI coders, as AI generates code one token at a time, similar to how a human writes.
Improves collaboration between human and AI coders. A smaller codebase written at a higher level of abstraction allows human developers to understand, modify and maintain AI-generated code more quickly and easily, resulting in higher-quality code that is developed faster.

Faster Deployment and Testing

Presently, deploying and testing cloud applications can take several minutes. When multiplied by numerous iteration cycles, there’s substantial potential for improvement. Particularly, as our AI friends assist us to accelerate code writing, the proportion of time spent on testing and validation within each iteration cycle becomes increasingly significant compared to code writing.

A prevalent workaround is running tests locally, bypassing cloud deployment. However, this approach presents its own challenges, as it necessitates simulating the cloud environment surrounding the tested components. Consequently, these tests are constrained in their scope, often requiring supplementary tests that run in the cloud to confirm code functionality within the actual environment.

Yet this is not the end of the journey. Such solutions primarily cater to automatic tests, while developers frequently desire manual interaction with applications during development or seek feedback from various stakeholders (product, sales, management, potential users, etc.). Achieving this without cloud deployment and its associated time penalties remains a challenge.

Hence, we need to be able to generate tests that can operate both locally and in the cloud and be executed swiftly. Additionally, we must enable rapid deployment of cloud applications and facilitate easy access for stakeholder validation.

By achieving this, we can significantly enhance iteration speeds, regardless of whether the code is created by AI, humans or is a collaborative effort.

So, how do we bring this vision to life?

Introducing Wing

Wing is a new programming language for cloud development that enables both human and AI developers to write cloud code at a higher level of abstraction, and it comes with a local simulator that lets developers test it quickly.

Quantifying the Improvement

As we’ll demonstrate below, we’re talking about a 90% – 95% reduction in code, and orders of magnitude increase in testing speeds.

Let’s See Some Code

Here’s an example of a small app that uploads a file to a bucket (think AWS S3, Azure Blob Storage or GCP Bucket) using a cloud function (AWS Lambda, Azure Function or GCP Cloud Function).

This is the code in Wing:

bring cloud;

let bucket = new cloud.Bucket();
new cloud.Function(inflight () => {
  bucket.put("hello.txt", "world!");

});

As you can see, whether a human or an AI coder is writing Wing code, they work at a high level of abstraction, enabling the Wing compiler to handle underlying cloud mechanics such as IAM policies and networking (don’t worry, it’s customizable and extensible, ensuring you maintain control when needed).

Unlike human and AI coders, the compiler is infallible. Additionally, it is faster, deterministic and doesn’t lose context over time. As a result, the more responsibilities we delegate to the compiler instead of humans or AI, the better the outcomes.

The compiler can adapt the app for any cloud provider, necessitating that humans only need to know and maintain the higher-level, cloud-agnostic code. The generated compilation artifacts, Terraform and JavaScript, can be deployed using proven, dependable tools.

Now let’s take a look at the same code in one of the leading cloud development stacks today — Terraform + JavaScript.

main.tf:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}

provider "aws" {
  region = "us-west-2"
}

locals {
  lambda_function_name = "upload_hello_txt_lambda"
}

resource "aws_s3_bucket" "this" {
  bucket = "my-s3-bucket"
  acl    = "private"
}

data "archive_file" "lambda_zip" {
  type        = "zip"
  source_file = "index.js"
  output_path = "${path.module}/lambda.zip"
}

resource "aws_lambda_function" "this" {
  function_name = local.lambda_function_name
  role          = aws_iam_role.lambda_role.arn
  handler       = "index.handler"
  runtime       = "nodejs14.x"
  filename      = data.archive_file.lambda_zip.output_path
  timeout       = 10

  environment {
    variables = {
      BUCKET_NAME = aws_s3_bucket.this.bucket
    }
  }
}

resource "aws_iam_role" "lambda_role" {
  name = "lambda_role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "lambda.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy" "lambda_policy" {
  name = "lambda_policy"
  role = aws_iam_role.lambda_role.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ]

        Effect   = "Allow"
        Resource = "arn:aws:logs:*:*:*"
      },

      {
        Action = [
          "s3:PutObject"
        ]

        Effect   = "Allow"
        Resource = "${aws_s3_bucket.this.arn}/*"
      }
    ]
  })
}

output "bucket_name" {
  value = aws_s3_bucket.this.bucket
}

output "lambda_function_name" {
  value = aws_lambda_function.this.function_name
}

index.js:
const AWS = require('aws-sdk');
const S3 = new AWS.S3();

exports.handler = async (event) => {
  const bucketName = process.env.BUCKET_NAME;
  const key = 'hello.txt';
  const content = 'Hello world!';
  const params = {
    Bucket: bucketName,
    Key: key,
    Body: content,
  };

  try {
    await S3.putObject(params).promise();
    return {
      statusCode: 200,
      body: JSON.stringify('File uploaded successfully.'),
    };

  } catch (error) {
    console.error(error);
    return {
      statusCode: 500,
      body: JSON.stringify('Error uploading the file.'),
    };

  }

};

As you can see, the Wing code is seven lines long, while the Terraform and JavaScript code is 122 lines, or ±17X more code. Not only that, it dives deeply into lower layers of the cloud stack.

You might be wondering if there are newer solutions against which Wing’s gains are less significant, or if the same results can be achieved through a library or a language extension. You can see how Wing compares to other solutions and why it’s a new language rather than some other solution here.

Testing with Wing

Wing comes out of the box with a local simulator and a visualization and debugging console.

These tools enable developers to work on their code with near-instant hot-reloading and test cloud applications very easily without having to mock the cloud around them.

In the example of our very simple app above, deploying to any cloud provider in order to run tests would take close to a minute, whereas with the Wing Simulator it takes less than a second — or two orders of magnitudes less. Moreover, with Wing, you can write tests without mocking the cloud and run the same ones on the simulator and in the cloud.

You can get a first-hand sense of it in the Wing Playground.

Conclusion

Although Wing introduces significant improvements in cloud development, we understand that migrating to a new language is a substantial undertaking that may be hard to justify in many cases.

We’ve gone to great lengths to make adopting the language as easy as possible with the following features:

Easy to learn because it is similar to other languages.
Works seamlessly with your existing stack and tools (especially deployment and management).
Mature ecosystem — import any NPM module or Terraform resource into your code.
Integrates into existing codebases — write runtime code in other languages and reference it with Wing.

Furthermore, we believe that in the era of AI, adopting a new language like Winglang is easier for humans as AI assists with writing code in unfamiliar languages and frameworks and simplifies the migration of existing code to new languages.

As we move toward a future where AI plays a more significant role in code development, the creation and adoption of languages like Winglang will ensure better collaboration, faster development and higher-quality applications for both human and AI developers.

To get a glimpse of the future and experience writing code in Wing and testing it instantly, you can visit our playground.

The post Winglang: Cloud Development Programming for the AI Era appeared first on The New Stack.

5 Tips to Build Like a Pro with Slack’s Deno SDK and New CLI

Steve Gill — Fri, 09 Jun 2023 16:33:18 +0000

With the recent launch of Slack’s next generation platform, we’ve released two new purpose-built developer tools that make building projects on the Slack platform a breeze. First, we released the new Slack CLI, which makes it incredibly easy to create, update and deploy projects right from the terminal. Next, we released the Deno Slack SDK, which is written in TypeScript and is open sourced and focused on helping developers build new modular apps quickly and securely.

How Do I Harness These New Capabilities?

Slack chose Deno, a relatively new JavaScript runtime, to power its new platform. This is because Deno is compliant with web standards, has a simpler dependency management system, and is secure by default, giving admins peace of mind and developers built-in granular controls, like the ability to execute code with limited access to the file system or external domains.

Deno also has support for TypeScript out of the box, which is our language of choice for developing new modular apps with the Deno Slack SDK. I’m especially a fan of the way Deno standardizes the way modules are imported in both JavaScript and TypeScript using the ECMAScript 6 import/export standard.

Throughout our open beta, we have been gathering feedback from developers using our new tools. These are my top tips to help you get the most out of our tools capabilities when building on the Slack platform:

Tip 1: Use Type Hints

Type hints will save you both time and frustration, so take advantage of the IntelliSense features of your code editor. Because the Deno Slack SDK is written in Typescript, you are provided with advanced type hints that make it easy to learn about what methods, arguments and types are available for projects you are building.

Tip 2: Develop Rapidly in Local Mode

Minimize risk and errors by rapidly iterating on your projects in local mode with the Slack CLI. Local mode (or “slack run”) gives you a way to quickly iterate on your projects and see the changes reflected in your workspace. This is a great way to get your projects functional before deploying them on Slack’s infrastructure.

Tip 3: Unit Test Your Functions

Use our built-in SlackFunctionTester utility to help write unit tests for your custom functions. SlackFunctionTester allows you to easily specify the inputs and verify the outputs of a custom function. Once you have written your unit tests, you can use Deno’s built-in test runner to run your tests. Read our docs to learn more about testing.

Tip 4: Take Advantage of Built-in Slack Functions

Slack now offers built-in Slack functions that allow you to do common tasks without having to write custom-coded functions or directly call the Slack API, saving you valuable time. Some examples of Slack functions include opening a form to gather data, creating and archiving channels, updating channel topics and sending messages in Slack. These can be used as steps in your coded workflows and will also be available in the upcoming revamped Workflow Builder.

Tip 5: Harness Sample Apps and Templates

Don’t know where to start? We’ve created many sample apps that are just waiting for you to tweak and make them your own. You can access them in the Slack CLI when running slack create or slack samples to see the entire list, or view them on our docs.

What’s the Future of the Deno Slack SDK and the Slack CLI?

With both the Slack CLI and the Deno Slack SDK, we are continuing to put our foot on the pedal to incorporate feedback and continue improving the developer experience. The Slack CLI will also look at improving its support for CI/CD and how easy it is to script common tasks.

My team is also looking into open sourcing our CLI on GitHub and can’t wait to share the code. With the Deno Slack SDK, we’re focusing on making it even easier to use functions and connectors created by other developers in your coded workflows. And with Deno 2.0 expected to be released this summer, we got a lot to look forward to. Come let us know your experience with the Deno Slack SDK on GitHub.

Slack aims to make work more efficient and productive for all, and creative devs like you are powering that effort. So test out our new platform, start building cool stuff, and let us know what you think!

The post 5 Tips to Build Like a Pro with Slack’s Deno SDK and New CLI appeared first on The New Stack.

Vision Pro for Devs: Easy to Start, but UI Not Revolutionary

Richard MacManus — Fri, 09 Jun 2023 16:09:14 +0000

“Welcome to the era of spatial computing,” announced Apple as it unveiled its latest device, a pair of mixed-reality goggles called the Vision Pro. CEO Tim Cook described it as “a new kind of computer that augments reality by seamlessly blending the real world with the digital world.” A new operating system powers the device, called visionOS — which Apple says contains “the building blocks of spatial computing.”

If it’s “a new type of computer,” as Apple claims, then that means a new greenfield for developers. So what can devs expect from visionOS and Vision Pro? I watched a WWDC session entitled “Get started with building apps for spatial computing to find out.

“By default, apps launch into the Shared Space,” began Apple’s Jim Tilander, an engineer on the RealityKit team. “This is where apps exist side-by-side, much like multiple apps on a Mac desktop. People remain connected to their surroundings through passthrough.” (Passthrough in this case means to switch attention from the virtual world to the physical world, or vice versa.)

He then introduced three new concepts, all of them SwiftUI scenes: Windows, Volumes, and Spaces. SwiftUI has been around for four years, serving as Apple’s primary user interface framework across its various products. For visionOS, SwiftUI has been bolstered with “all-new 3D capabilities and support for depth, gestures, effects, and immersive scene types.”

Each of the three scenes is self-explanatory, but it’s worth noting that in addition to the “Shared Space” concept, Apple also has “Full Space,” which is when you want “a more immersive experience” for an application and so “only that app’s content will appear.”

It’s interesting to note that Apple appears to have a different definition of “presence” than Meta (née Facebook). Meta defines presence as “high fidelity digital representations of people that create a realistic sense of connection in the virtual world.” In other words, “presence” to Meta means being fully immersed in the virtual world. But based on the following graphic I saw in this session, “presence” to Apple means less immersion — it’s letting the physical world enter the view of your Vision Pro goggles.

Privacy Pros and Cons

Apple claims that the Vision Pro and visionOS platform treat user privacy as a core principle, while also “making it easy for you as a developer to leverage APIs to take advantage of the many capabilities of the device.”

Apple’s solution to preserving user privacy is to curate data and interactions for developers. Tilander gave two interesting examples of this.

“Instead of allowing apps to access data from the sensors directly, the system does that for you and provides apps with events and visual cues. For example, the system knows the eye position and gestures of somebody’s hands in 3D space and delivers that as touch events. Also, the system will render a hover effect on a view when it is the focus of attention, but does not communicate to the app where the person is looking.”

Sometimes “curated” data won’t be enough for developers. Tilander explained that “in those cases where you actually do need access to more sensitive information, the system will ask the people for their permission first.”

Given how potentially invasive the Vision Pro is to peoples’ privacy — including the user, since it has eye-scanning capabilities for login and tracking — the restrictions Apple has imposed on developers sound reasonable.

However, Google developer Brandon Jones pointed out on Twitter that “if you want to do AR apps, you must give Apple full rendering control.” While generally, he thinks this is a good thing — “You don’t want, for example, ads to be able to infer how much time a user spent looking at them” — he isn’t so excited about Apple “quietly re-inventing and side-stepping web standards in order to achieve that.”

In a nutshell, Apple’s privacy restrictions for Vision Pro are implemented at the OS level, giving Apple a great deal of control. Jones admitted that most developers will be comfortable with that, but he correctly noted that “Apple (already notorious for clamping down on what you can do with iOS) is doubling down on restricting the ways you can diverge from their chosen patterns.”

The Tools

“Everything starts with Xcode,” Tilander said, regarding how developers will build apps for visionOS. Xcode is Apple’s integrated development environment (IDE) and it comes with a simulator for Vision Pro and an enhanced “Instruments” performance analysis tool (which includes a new template, RealityKit Trace).

The frameworks to build 3D content are ARKit and RealityKit, which handle tracking, rendering, physics, animations, spatial audio, and more.

For visionOS, Apple is introducing a new editor called Reality Composer Pro, which “allows you to preview and prepare 3D content for your apps.” A Reddit user described it as “like Powerpoint in AR,” so the emphasis is on ease of use.

No doubt realizing that it needed more than just existing Apple developers to start thinking about developing for Vision Pro, Apple has also partnered with Unity, an existing 3D platform. In the WWDC 23 opening keynote, one of the presenters noted that “popular Unity-based games and apps can gain full access to visionOS features, such as passthrough, high-resolution rendering, and native gestures.” Tilander confirmed in his session that no Unity plug-ins would be required, and that developers can simply “bring your existing content over.”

How to Get Started

To begin a new app, in Xcode you can choose the default app template for “xrOS” (apparently the shortened version of visionOS). From there, you select a “scene type,” with the default being “Window.” This is in a Shared Space by default, but you can change that.

“And when you finish the assistant,” continued Tilander, “you are presented with an initial working app in SwiftUI that shows familiar buttons mixed in with a 3D object rendered with RealityKit.”

You can also easily convert iPhone or iPad apps into visionOS apps, noted Tilander.

Developers can expect more resources, including a developer kit, in July. An initial visionOS SDK will be available in Xcode by the end of this month.

Apple Keen for Devs to Jump Into 3D

As usual when Apple announces a new device, a lot of thought has been put into the developer tools and techniques for it. There’s nothing in visionOS that looks out of reach for existing iOS developers, so it’s a fairly seamless transition for Apple’s developer community.

Of course, the drawback is that Apple is enticing developers into yet another closed developer ecosystem. visionOS will have its own App Store, we were told at WWDC 23, but you can guarantee it won’t be any more open than the iOS App Store.

The final thing to note for developers is that the user interface really isn’t that different from iPhone, at least for the first-generation Vision Pro. “They’re still just rectangles on the internet,” as one Twitter user put it. As others have pointed out, this is probably because Apple wants to make it easy for its existing developers to start building on visionOS. Now, from a user point of view, early reports suggest that Vision Pro may indeed be magical. But from a developer point of view, Vision Pro isn’t that revolutionary — yet.

The post Vision Pro for Devs: Easy to Start, but UI Not Revolutionary appeared first on The New Stack.

GitLab All in on AI: CEO Predicts Increased Demand for Coders

Loraine Lawson — Fri, 09 Jun 2023 15:26:19 +0000

GitLab is all in on AI, with CEO and co-founder Sid Sijbrandij calling it “one of the most exciting technology developments of our time” and making an unusual prediction that it will create demand for more programmers.

“AI represents a major shift for our industry. It fundamentally changes the way that software is developed,” Sijbrandij said on GitLab’s earnings call Monday. “We believe it will accelerate our ability to help organizations make software, faster. I am excited about this next wave of technology innovation.”

GitLab plans to incorporate AI at all levels of its DevSecOps platform, he added.

“We believe that an AI-powered platform focused solely on the Developer persona is incomplete. It is missing essential Security, Operations, and Enterprise functionality,” Sijbrandij said. “Remember: developers spend only a small fraction of their time developing code. The real promise of AI extends far beyond code creation.”

During the first quarter of the year, GitLab delivered five new AI features, followed by five more in May with the release of GitLab 16 — including a beta of Code Suggestion, as well as security testing and analysis, observability and proactive vulnerability detection. Additional AI-powered features available include Suggested Reviewers for code review, Explain This Vulnerability for vulnerability remediation, and Value Stream Forecasting for predicting future team efficiency. Code Suggestions does what its name implies, making code suggestions to developers as they type.

“We’re proud to have 10 AI features available to customers today, almost three times more than the competition,” he said, adding that applying AI to a single data store, for the full software development life cycle, also creates compelling business outcomes and is something he believes can be done with GitLab.

GitLab continues to iterate on Code Suggestions and expects to make it generally available later this year. The company has also boosted language support from six languages to 13, so more developers can use it, he added.

“Code Suggestions is uniquely built with privacy first as a critical foundation,” he said. “Our customers’ proprietary source code never leaves GitLab’s cloud infrastructure. This means that their source code stays secure. In addition, model output is not stored and not used as training data.”

AI Support for Development Teams

Also later this year, the company plans to introduce an AI add-on focused on supporting development teams, which will include Code Suggestions functionality, across all GitLab’s tiers at an anticipated price point of $9 per user per month, billed annually, he said.

He noted that they’d had many conversations with senior-level customers, but one comment from the CTO of a top European bank stood out.

“When the conversation moved into AI, the CTO said something extremely interesting. He said: Code generation is only one aspect of the development cycle. If we only optimize code generation, everything else downstream from the development team — including QA, security, and operations — breaks. It breaks because these other teams involved in software development can’t keep up,” Sijbrandij said. “This point — incorporating AI throughout the software development lifecycle — is at the core of our AI strategy.”

Companies Reevaluating Strategies in the Wake of AI

Customers are also reevaluating their own software supply chain through the AI lens, he added. Additionally, chief information security officers are also engaging with AI, and applying governance, security, compliance and audit-ability to it.

He predicted that AI will increase GitLab’s market for three reasons. First, AI will make writing software easier, which in turn will expand the audience of people — including junior and citizen developers — who build software. Second, as developers become more productive, software will become less expensive to create, which will fuel demand for more software and require more developers to meet the additional need. Third, the company expects more customers will turn to its solutions as they build machine learning models and AI into their applications.

“As we add ModelOps capabilities to our DevSecOps platform, this will invite data science teams as new personas, and will allow these teams to work alongside their Dev, Sec, and Ops counterparts,” he said. “We see ModelOps as a big opportunity for GitLab.”

Sijbrandij also shared how global security and aerospace company Lockheed Martin used GitLab to reduce their toolchain and complexity while reducing costs. The Lockheed Martin team has reported 80x faster CI pipeline builds and 90% less time spent on system maintenance, he said, adding they’ve also retired thousands of Jenkins servers. They’ve also moved from monthly or weekly deliveries to daily or multiple daily deliveries.

The post GitLab All in on AI: CEO Predicts Increased Demand for Coders appeared first on The New Stack.

Unlocking DevSecOps’ Potential Challenges, Successes, Future

Steven J. Vaughan-Nichols — Fri, 09 Jun 2023 14:40:20 +0000

It has been more than 15 years since DevOps emerged on the technology landscape, promising to revolutionize team collaboration and streamline development processes. While some people now say Platform Engineering is the one true way forward, DevOps scope widened to include security, giving rise to DevSecOps, which remains influential. Unfortunately, even as the need for coding and operational security grows, a Progress Software study has found that many organizations have struggled to implement DevSecOps.

To find out why, Progress interviewed 606 IT/Security/App Dev and DevOps decision-makers from organizations with over 500 employees across 11 countries. The survey’s goals were to identify what was hindering DevSecOps success and to uncover best practices from companies with thriving DevSecOps programs.

The Challenges

They found:

DevSecOps success has been hindered by complexity and constant change.
Effective DevSecOps requires collaboration and investment in culture.
The desire to succeed in DevSecOps did not guarantee mastery of its practices.

These DevSecOps challenges included complexity, competing priorities, and a lack of clear business impact and Return on Investment (ROI). Additionally, while the participants recognized the potential benefits of adopting cloud native technology, AI, and Policy as Code in their DevSecOps strategy, they had trouble demonstrating the ROI for these investments. That, of course, made it difficult to secure buy-in from stakeholders.

In addition, despite security threats being the primary driver for DevOpsSec evolution, many respondents proved only somewhat familiar with how security fits in DevSecOps. In short, they didn’t really understand the techniques they were trying to use. Specifically, they had trouble prioritizing security efforts, securing different types of workloads, and meeting delivery deadlines and audit requirements.

While everyone agreed that collaboration and culture were critical factors for successfully implementing DevSecOps, only 30% of the respondents felt confident in the level of collaboration between security and development teams. Furthermore, 71% agreed that culture was the biggest barrier to DevSecOps progress, yet only 16% prioritized culture as an area for optimization in the next 12-18 months. This discrepancy underscored the need for fostering a collaborative culture within organizations.

Addressing the Challenges

Therefore, to fully harness the potential of DevSecOps, organizations must address several key challenges. These are:

Overcome obstacles to collaboration: Encourage cross-functional communication and collaboration between security, app development, and other teams.
Incorporate new technologies and processes: Balance modernizing technology, processes, and culture, as focusing on just one area will not be enough.
Address conflicting interests: Ensure leadership prioritizes and invests in key areas that drive DevSecOps success, including adopting a holistic approach that engages teams from across the organization.
Build confidence in securing cloud native adoption: Focus on implementing and leveraging the benefits of cloud-first technologies while considering cloud security.

It’s become clear that even though we’ve been using DevOps for years, many of us still haven’t mastered creating an effective DevSecOps culture. Companies must engage in honest conversations from the executive level down about where they are in their journey and how to move forward to success.

The post Unlocking DevSecOps’ Potential Challenges, Successes, Future appeared first on The New Stack.

Infrastructure as Code: Modernizing for Faster Development

Heather Joslyn — Thu, 08 Jun 2023 19:59:41 +0000

Before Matt Stephenson worked at Starburst Data, he used to work at Square. There, he learned some hard lessons about working with legacy Infrastructure as Code (IaC).

“We built an entire system that kind of did a lot of orchestration with Terraform and Helm, and integrated with some of our own backend services,” Stephenson, a senior principal software engineer at Starburst, told The New Stack.

It’s not a project he remembers fondly: “The experience of having to build and maintain that service made me take a look at what was available out in the industry, for not having to build that again.”

The problem isn’t Terraform per se, he said, but “it’s all of the code to execute Terraform, all the code to manage the inputs and outputs to Terraform itself.”

Legacy IaC can bring a number of challenges into the lives of an engineering team. Among them:

It can get more complex to follow required conventions and standards when configurations are being defined, and the complexity grows as it scales.
As a result, configuration drift is common, and can result in noncompliance and service outages. (And misconfigurations in general are a leading cause of security breaches.)
Necessary integrations and features aren’t always available for specific use cases.
Legacy IaC can bring significant maintenance needs, and it can be tough to recruit and retain engineers who have those skills.

“A lot of the legacy Infrastructure as Code products have their own language, they have their own environment, you have to kind of become a bit of an expert in them to be effective at them,” Stephenson said. “Or you have to have some kind of support, going into using one of those.”

At Starburst Data, he oversees the architecture for the company’s Galaxy product, a managed data lake analytics platform. His team has gradually swapped out legacy IaC for Pulumi, an open source IaC product that allows infrastructure to be built in any programming language.

Stephenson will be among the presenters at PulumiUP, a virtual user conference on June 15 dedicated to Infrastructure as Code, how it enables faster application development and how users can navigate the challenges of legacy systems.

At the conference, he’ll be talking about Pulumi’s automation API, he said. “That was a big driver for us, being able to orchestrate all of our Pulumi stacks without having to write that whole service that we had to write in the past.”

Empowering the Whole Team

One of the differences between Pulumi and legacy IaC solutions, Stephenson said, is that “it’s based in programming languages that people learn in college or learn really quickly when they join the industry.”

Pulumi allows developers to build infrastructure in common languages including any Java language (Java, Scala, Clojure, Groovy, Kotlin); .NET (C#, F#, PowerShell); Node.js (JavaScript, TypeScript); Go, Python and even YAML. This helps make provisioning infrastructure something that more members of an engineering team can do.

Before his experience using Pulumi, Stephenson said, “it was mostly more senior engineers that would be involved in setting up all of your infrastructures, your code environments. These days we have folks across the skill-set level working in it.

Now, he said, even people in his organization without infrastructure or site reliability engineering backgrounds, “when they’re doing product development, they’re able to just go in and make the changes they need. They don’t really have to worry about engaging an expert to be able to get something to happen that they want.”

As a result, Stephenson added, there’s less need for hiring IaC-specific experts for a team, and more people are empowered to handle problems.

“If there’s an incident that involves the infrastructure, a lot of times people can make the changes they need to execute our continuous delivery pipeline and get things fixed.”

A Search for Flexibility

Dennis Sauvé, the first DevOps engineer hired by Washington Trust Bank, will also be presenting at PulumiUP, talking about his company’s experience moving from an entirely on-premises system to one run on Microsoft Azure Cloud — with IaC, largely written in TypeScript, provided by Pulumi.

Before the bank hired Sauvé, it decided to start services to the cloud to move forward with innovations like a customer collaboration tool that will allow Washington Trust’s relationship managers to talk with clients directly. It had determined that Azure’s communications services would help it build that application more easily, Sauvé told The New Stack.

But the bank also wanted flexibility for applications it might build in the future, and for the clouds it might deploy those apps on.

Pulumi, Sauvé said, offered that flexibility and the options his team needed. “You can pick your cloud provider. And then once you have a cloud provider, you can pick a language you want to build that stack in, and they support it.

“And so we had that peace of mind that not only if we wanted to change the language we wrote our Infrastructure as Code with, we could also change our cloud provider as well. We could go to [Amazon Web Services] or Google Cloud, and we’d be able to take plenty right along with us. So that was a huge bonus when considering different providers.”

Saving Time and Toil

One of the biggest benefits of Pulumi for Washington Trust Bank, Sauvé said, has been the ability it gives his team to save time and toil. He and his development team have been creating best-practice templates for creating resources.

Instead of the back-and-forth that might have existed between developers and operations engineers, “the developers can now just go to our infrastructure package, find the resource that they want to build, choose that and set it up to deploy. And it really speeds up development and testing environments.”

Not only that, he added, but Pulumi has become a standardization tool, ensuring that resources are being created in the same way across the organization.

However, he added, moving to the cloud and onto Pulumi, hasn’t been without hiccups. Notably, the native Typescript package is, “from a file-size standpoint, just a massive package that is a little taxing on resources to use, but it works in production.”

Pulumi, he noted, will soon release a next-generation version of the TypeScript package that “should be very slimmed down and address some of the performance issues.”

Shifting away from legacy IaC can cause a bit of disruption on a team at first, Stephenson acknowledged. (“There’s always folks that really kind of hang their hat on being the expert in the room with specific things like Terraform,” he noted.)

But in the long run, he said, it empowers a broader set of people in the organization. He pointed to a colleague who joined Starburst Data soon after graduating from college: “Now he’s at a senior level; he’s basically gotten himself bumped a level twice, because he’s just so on top of everything. Pulumi was one of those things that he really dug into.”

Stephenson has heard similar stories from other companies. “You end up with people who might push back, but then at the end of the day, there’s a lot of people who excel and become the next rock stars as a result of making a shift like this.”

To learn more about Infrastructure as Code (and see presentations by Stephenson, Sauvé and other experts), register for June 15’s virtual event, PulumiUP.

The post Infrastructure as Code: Modernizing for Faster Development appeared first on The New Stack.

How LLMs Are Transforming Enterprise Applications

Ed Anuff — Thu, 08 Jun 2023 14:12:10 +0000

Artificial intelligence is the most transformative paradigm shift since the internet took hold in 1994. And it’s got a lot of corporations, understandably, scrambling to infuse AI into the way they do business.

One of the most important ways this is happening is via generative AI and large language models (LLMs), and it’s far beyond asking ChatGPT to write a post about a particular topic for a corporate blog or even to help write code. In fact, LLMs are rapidly becoming an integral part of the application stack.

Building generative AI interfaces like ChatGPT — “agents” — atop a database that contains all the data necessary and that can “speak the language” of LLMs is the future (and, increasingly, the present) of mobile apps. The level of dynamic interaction, access to vast amounts of public and proprietary data, and ability to adapt to specific situations make applications built on LLMs powerful and engaging in a way that’s not been available until recently.

And the technology has quickly evolved to the extent that virtually anyone with the right database and the right APIs can build these experiences. Let’s look at what’s involved.

Generative AI Revolutionizes the Way Applications Work

When some people hear “agent” and “AI” in the same sentence, they think about the simple chatbot that appears as a pop-up window that asks how it can help when they visit an e-commerce site. But LLMs can do much more than respond with simple conversational prompts and answers pulled from an FAQ. When they have access to the right data, applications built on LLMs can drive far more advanced ways to interact with us that deliver expertly curated information that is more useful, specific, rich — and often uncannily prescient.

Here’s an example:

You want to build a deck in your backyard, so you open your home-improvement store’s mobile application and ask it to build you a shopping list. Because the application is connected to an LLM like GPT-4 and many data sources (the company’s own product catalog, store inventory, customer information and order history, along with a host of other data sources), it can easily tell you what you’ll need to complete your DIY project. But it can do much more.

If you describe the dimensions and features you want to include in your deck, the application can offer visualization tools and design aids. Because it knows your postal ZIP code, it can tell you which stores within your vicinity have the items you need in stock. It can also, based on the data in your purchase history, suggest that you might need a contractor to help you with the job — and provide contact information for professionals near you.

The application could also tell you the amount of time it will take deck stain to dry (even including the seasonal climate trends for where you live) and how long it’ll be until you can actually have that birthday party on your deck that you’ve been planning. The application could also assist with and provide information on a host of other related areas, including details on project permit requirements and the effect of the construction on your property value. Have more questions? The application can help you at every step of the way as a helpful assistant that gets you where you want to go.

Using LLMs in Your Application Is Hard, Right?

This isn’t science fiction. Many organizations, including some of the largest DataStax customers, are working on many projects that incorporate generative AI.

But these projects aren’t just the realm of big, established enterprises; they don’t require vast knowledge about machine learning or data science or ML model training. In fact, building LLM-based applications requires little more than a developer who can make a database call and an API call. Building applications that can provide levels of personalized context that were unheard of until recently is a reality that can be realized with anyone who has the right database, a few lines of code and an LLM like GPT-4.

LLMs are very simple to use. They take context (often referred to as a “prompt”) and produce a response. So, building an agent starts with thinking about how to provide the right context to the LLM to get the desired response.

Broadly speaking, this context comes from three places: the user’s question, the predefined prompts created by the agent’s developer and data sourced from a database or other sources (see the diagram below).

A simple diagram of how an LLM gathers context to produce a response.

The context provided by the user is typically simply the question they input into the application. The second piece could be provided by a product manager who worked with a developer to describe the role the agent should play (for example, “you’re a helpful sales agent who is trying to help customers as they plan their projects; please include a list of relevant products in your responses”).

Finally, the third bucket of provided context includes external data pulled from your databases and other data sources that the LLM should use to construct the response. Some agent applications may make several calls to the LLM before outputting the response to the user in order to construct more detailed responses. This is what technologies such as ChatGPT Plug-ins and LangChain facilitate (more on these below).

Giving LLMs Memory

AI agents need a source of knowledge, but that knowledge has to be understandable by an LLM. Let’s take a quick step back and think about how LLMs work. When you ask ChatGPT a question, it has very limited memory or “context window.” If you’re having an extended conversation with ChatGPT, it packs up your previous queries and the corresponding responses and sends that back to the model, but it starts to “forget” the context.

This is why connecting an agent to a database is so important to companies that want to build agent-based applications on top of LLMs. But the database has to store information in a way that an LLM understands: as vectors.

Simply put, vectors enable you to reduce a sentence, concept or image into a set of dimensions. You can take a concept or context, such as a product description, and turn it into several dimensions: a representation of a vector. Recording those dimensions enables vector search: the ability to search on multidimensional concepts, rather than keywords.

This helps LLMs generate more accurate and contextually appropriate responses while also providing a form of long-term memory for the models. In essence, vector search is a vital bridge between LLMs and the vast knowledge bases on which they are trained. Vectors are the “language” of LLMs; vector search is a required capability of databases that provide them with context.

Consequently, a key component of being able to serve LLMs with the appropriate data is a vector database that has the throughput, scalability and reliability to handle the massive datasets required to fuel agent experiences.

… with the Right Database

Scalability and performance are two critical factors to consider when choosing a database for any AI/ML applications. Agents require access to vast amounts of real-time data and require high-speed processing, especially when deploying agents that might be used by every customer who visits your website or uses your mobile application. The ability to scale quickly when needed is paramount to success when it comes to storing data that feeds agent applications.

Apache Cassandra is a database that’s relied on by leaders like Netflix, Uber and FedEx to drive their systems of engagement, and AI has become essential to enriching every interaction that a business serves. As engagement becomes agent-powered, Cassandra becomes essential by providing the horizontal scalability, speed and rock-solid stability that makes it a natural choice for storing the data required to power agent-based applications.

For this reason, the Cassandra community developed the critical vector search capabilities to simplify the task of building AI applications on huge datasets, and DataStax has made these capabilities easily consumable via the cloud in Astra DB, the first petascale NoSQL database that is AI-ready with vector capabilities (read more about this news here).

How It Is Being Done

There are a few routes for organizations to create agent application experiences, as we alluded to earlier. You’ll hear developers talk about frameworks like LangChain, which as the name implies, enables the development of LLM-powered agents by chaining together the inputs and outputs of multiple LLM invocations and automatically pulling in the right data from the right data sources as needed.

But the most important way to move forward with building these kinds of experiences is to tap into the most popular agent on the globe right now: ChatGPT.

ChatGPT plugins enable third-party organizations to connect to ChatGPT with add-ons that serve up information on those companies. Think about Facebook. It became the social network platform, with a huge ecosystem of organizations building games, content and news feeds that could plug into it. ChatGPT has become that kind of platform: a “super agent.”

Your developers might be working on building your organization’s own proprietary agent-based application experience using a framework like LangChain, but focusing solely on that will come with a huge opportunity cost. If they aren’t working on a ChatGPT plugin, your organization will miss out on a massive distribution opportunity to integrate context that is specific to your business into the range of possible information ChatGPT can supply or actions it can recommend to its users.

A range of companies, including Instacart, Expedia, OpenTable and Slack have built ChatGPT plugins already; think about the competitive edge their integration with ChatGPT might create.

An Accessible Agent for Change

Building ChatGPT plug-ins will be a critical part of the AI-agent projects that businesses will look to engage in. Having the right data architecture — in particular, a vector database — makes it substantially easier to build very high-performance agent experiences that can quickly retrieve the right information to power those responses.

All applications will become AI applications. The rise of LLMs and capabilities like ChatGPT plugins is making this future much more accessible.

Want to learn more about vector search in Cassandra? Register for the June 15 webinar.

The post How LLMs Are Transforming Enterprise Applications appeared first on The New Stack.

Sundeck Launches Query Engineering Platform for Snowflake

Andrew Brust — Thu, 08 Jun 2023 12:00:06 +0000

Sundeck, a new company led by one of the co-founders of Dremio, recently launched a public preview of its eponymous SaaS “query engineering” platform. The platform, which Sundeck says is built for data engineers, analysts and database administrators (DBAs), will initially work with Snowflake‘s cloud data platform. Sundeck will be available free of charge during the public preview; afterward, the company says it will offer “simple” pricing, including both free and premium tiers.

Sundeck (the product) is built atop an Apache-licensed open source project called Substrait, though it offers much additional functionality and value. Sundeck (the company) has already closed a $20M seed funding round, with participation from venture capital firms Coatue, NEA and Factory.

What Does It Do?

Jacques Nadeau, formerly CTO at Dremio and one of its co-founders, briefed the New Stack and explained in depth how Sundeck query engineering works. Nadeau also described a number of Sundeck’s practical applications.

Basically, Sundeck sits between business intelligence (BI)/query tools on the one hand, and data sources (again, just Snowflake, to start) on the other. It hooks into the queries and can dynamically rewrite them. It can also hook into and rewrite query results.

Sundeck “hooks” (bottom, center) insinuate themselves in the query path between data tools (on the left) and the data source (on the right). Credit: Sundeck

One immediate benefit of the query hook approach is that it lets customers optimize the queries with better SQL than the tools might generate. By inspecting queries and looking for specific patterns, Sundeck can find inefficiencies and optimize them on-the-fly, without requiring users, or indeed BI tools, to do so themselves.

Beyond Query Optimization

More generally, though, Sundeck lets customers evaluate rules and take actions. The rules can be based on the database table(s) being queried, the user persona submitting the query or even properties of the underlying system being queried. This lets Sundeck do anything from imposing usage quotas (and thus controlling Snowflake spend); to redirecting queries to different tables or a different data warehouse; rejecting certain high-cost queries outright; reducing or reshaping a result set; or kicking off arbitrary processes.

In effect, Sundeck takes the call-and-response pipeline between client and database and turns it into an event-driven service platform, with a limitless array of triggers and automated outcomes. But that’s not to say Sundeck does this in some generic compute platform-like fashion. Instead, it’s completely contextual to databases, using Snowflake’s native API.

Must read:

With that in mind, we could imagine other applications for Sundeck, including observability/telemetry analytics, sophisticated data replication schemes and even training of machine learning models, using queries and/or result sets as training or inferencing data. Data regulation compliance, data exfiltration prevention, and responsible AI processes are other interesting applications for Sundeck. Apropos of that, Sundeck says its private result path technology ensures data privacy and that its platform is already SOC 2-certified.

In the Weeds

If all of this sounds a bit geeky, that would genuinely seem to be by design. Sundeck’s purpose here was to provide a user base — that already works at a technical level — access to the query pipeline, which heretofore has largely been a black box. This user audience is already authoring sophisticated data transformation pipelines with platforms like dbt, so why not let them transform queries as well?

It’s no surprise that Sundeck is a product that lives deep in the technology stack. After all, Nadeau previously led similarly infrastructural open source projects like Apache Arrow, which provides a unified standard for storing columnar data in memory (and which Nadeau says is an important building block in Snowflake’s platform), and Apache Drill, which acts as a SQL federated query broker. The rest of the fifteen-person Sundeck team has bona fides similar to Nadeau’s, counting 10 Apache project management committee (PMC) leaders, and even co-founders of Apache projects, like Calcite and Phoenix, among its ranks.

Check out:

Sunny Forecast on Deck?

If data is the lifeblood of business, then query pathways are critical arteries in a business’ operation. As such, being able to observe and intercept queries, then optimize them or automate processes in response to them, seems like common sense. If Sundeck can expand to support the full array of major cloud data warehouse and lakehouse platforms, query engineering could catch on and an ecosystem could emerge.

The post Sundeck Launches Query Engineering Platform for Snowflake appeared first on The New Stack.

Chainguard Unveils Speranza: A Novel Software Signing System

Steven J. Vaughan-Nichols — Wed, 07 Jun 2023 21:22:23 +0000

Chainguard Labs, the company behind Sigstore code signing, in collaboration with MIT CSAIL and Purdue University researchers, has unveiled a new preprint titled “Speranza: Usable, privacy-friendly software signing.” It describes how to balance usability and privacy in software signing techniques. The result, they hope, will augment software supply chain security.

Why is this a problem? Because as it is, there’s no guarantee that the person signing the code is actually the authorized author. “Digital signatures can bolster authenticity and authorization confidence, but long-lived cryptographic keys have well-known usability issues, and Pretty Good Privacy (PGP) signatures on PyPI [Python Package Index], for example, are “worse than useless.”

That’s because PGP isn’t really a standard. In PyPI‘s case, “many are generated from weak keys or malformed certificates.” In short, there is no ultimate source of trust for PGP signatures. We need more.

In addition, the Sigstore’s keyless signing flow exposes your email with every signature. For a variety of reasons, not everyone wants their name attached to a signed code artifact.

Where Speranza Comes in

That’s where the Speranza project comes in. It takes a novel approach to this problem, by proposing a solution that maintains signer anonymity while verifying software package authenticity.

It does this by incorporating zero-knowledge identity co-commitments. Zero-knowledge is a blockchain cryptographic approach. It reveals that a piece of hidden information is valid and known by the prover with a high degree of certainty.

In the Speranza model, a signer uses an automated certificate authority (CA) to generate a new pseudonym each time. These pseudonyms are manifested as Pedersen commitments. While these are cryptographically linked to plaintext email identities, they don’t reveal any further identity information. Using a zero-knowledge approach, they still, however, assure you that there’s a real, legitimate user behind the code’s signature.

Adding another layer to the process, the signer uses the zero-knowledge commitment equality proofs to show that two commitments denote the same value. Thus, you can be sure a pair of signatures originated from the same identity, all while keeping that identity and any links between other signatures concealed.

Got all that? I hope so because that’s as clear as I can make it.

Sperenza Can Be Trusted

Pragmatically speaking, Speranza-based signatures can be trusted. That is not the case with many current signature approaches. Or, for example, the npm package provenance sidesteps this issue by using machine identities, but this doesn’t help with author signature privacy issues.

The proposed Speranza approach also requires a package repository to maintain the mapping from package names to commitments to the identities of authorized signers. This allows the signer to create proof that the identity from the map and the identity for their latest signature are co-commitments. The project also employs techniques from key transparency to alleviate the necessity for users to download the entire package ownership map.

The Speranza Rust proof-of-concept implementation shows that the overheads for maintainers (signing) and end users (verifying) are minimal, even for repositories hosting millions of packages. The system dramatically reduces data requirements, and server costs are negligible.

In conclusion, Speranza represents a feasible, practical solution that has the potential to operate on the scale of the largest software repositories currently in existence. By successfully marrying robust verification with crucial privacy measures, it aims to enable deployment on real package repositories and in enterprise settings.

Read the paper, give the code a try, and let the Speranza let you know what you think. This is still a work in progress.

The post Chainguard Unveils Speranza: A Novel Software Signing System appeared first on The New Stack.

Speeding up Codecov Analysis for Xcode Projects

Isaac Halvorson — Wed, 07 Jun 2023 16:02:54 +0000

Tandem Diabetes Care develops and manufactures insulin pumps for use by people with diabetes. These pumps deliver a precise amount of insulin directly into the body and are one of the few on the market that can be controlled by a mobile app. Our customers need our products and services to be reliable and performant at all times. We have little margin for error and need to ship the highest quality code we can.

As a member of the mobile infrastructure team, I focus on developer automation and tooling of Tandem’s mobile apps. One of the newer tools we’ve been using at Tandem is Codecov. We love that it can track and report on our code coverage in total and on pull requests.

Codecov helps us understand which parts of our codebase are being tested by our unit tests and also helps us keep track of how the percentage of our codebase being tested changes over time. For example, if a pull request would introduce a greater than 1% change to our total code coverage, merging it will be blocked until that is addressed.

The Problem

One of my focuses recently was finding ways to optimize our CI workflows. I noticed that for our iOS app, the Codecov step of our workflow was taking much longer than expected, so I decided to find out why and if there was anything we could do to improve it.

Apple’s integrated development environment Xcode collects code coverage data and can display it in the IDE for developers. It can also be exported as an .xcresult file when using xcodebuild from the command line, like so:

xcrun xcodebuild test \
	-project tconnect.xcodeproj \
	-scheme tconnect \
	-testPlan tconnect \
	-destination "platform=iOS Simulator,name=iPhone 14" \
	-derivedDataPath DerivedData \
	-resultBundlePath artifacts/ResultBundle.xcresult

These xcresult files are great and can be useful in lots of different ways, but like many things Apple, they can be difficult to use outside the Apple ecosystem. While they can be converted into a JSON representation using the xccov binary included with Xcode, the resulting JSON is not in the standard coverage formats that Codecov can ingest. It also has been known to change without warning with new Xcode releases.

And so, in order for Codecov to use the coverage results from Xcode, they have to be converted into another format. Codecov’s official GitHub Action can do the conversion, but it handles this conversion by analyzing the coverage for each file one by one, which can take up to a second for each file. This is a fine enough approach for some projects, but when working with a large codebase like ours, that can take quite some time.

Enter xcresultparser

The open source Swift tool xcresultparser can parse xcresult files and convert them into various other formats. One of these formats is Cobertura XML, which Codecov supports.

The big advantage xcresultparser brings is that, because it is a compiled program and not a script, it can use multiple threads to do the conversion. This speeds up the conversion process immensely.

After running the xcodebuild command above to generate the xcresult file, we tell xcresultparser to convert it like so:

xcresultparser \
	--output-format cobertura \
	"artifacts/ResultBundle.xcresult" >"artifacts/coverage.xml"

And finally, we tell the Codecov GitHub Action to upload that XML file instead of the xcresult file.

Results

So, just how much time savings are we seeing?

We run these builds in parallel, so the total real-time savings for each build is the delta of the longer packages unit tests build: about seven minutes. This might not seem like much, but when you factor in that we’re running these builds upwards of 20 times a day, it’s a considerable time (and cost) savings. That’s over two hours of total developer time saved per day, almost 12 hours per week!

Bonus

While implementing xcresultparser for our project, I learned that it can also print a summary of test results to the command line. For our packages unit test build, we run eight separate test suites in serial, so if a test fails further up the log, it can be difficult to find it.

Now, at the end of each test run, we print out a summary like so:

xcresultparser \
	--output-format cli \
	--failed-tests-only \
	"artifacts/ResultBundle.xcresult"

This produces output that looks like this:

Now, it’s very easy for our developers to see which specific tests had failures just by looking at the end of the test log in CI.

Conclusion

Using xcresultparser has improved the lives of our developers quite a bit. And the fact that it is open source means that we as a developer community can help improve it for the benefit of ourselves and others.

The post Speeding up Codecov Analysis for Xcode Projects appeared first on The New Stack.