
SQL: The Universal Solvent for REST APIs

O'Reilly Radar - Tue, 2022/07/19 - 04:16

Data scientists working in Python or R typically acquire data by way of REST APIs. Both environments provide libraries that help you make HTTP calls to REST endpoints, then transform JSON responses into dataframes. But that’s never as simple as we’d like. When you’re reading a lot of data from a REST API, you need to do it a page at a time, but pagination works differently from one API to the next. So does unpacking the resulting JSON structures. HTTP and JSON are low-level standards, and REST is a loosely-defined framework, but nothing guarantees absolute simplicity, never mind consistency across APIs.

What if there were a way of reading from APIs that abstracted all the low-level grunt work and worked the same way everywhere? Good news! That is exactly what Steampipe does. It’s a tool that translates REST API calls directly into SQL tables. Here are three examples of questions that you can ask and answer using Steampipe.

1. Twitter: What are recent tweets that mention PySpark?

Here’s a SQL query to ask that question:

select
  id,
  text
from
  twitter_search_recent
where
  query = 'pyspark'
order by
  created_at desc
limit 5;

Here’s the answer:

+---------------------+-------------------------------------------------------------------+
| id                  | text                                                              |
+---------------------+-------------------------------------------------------------------+
| 1526351943249154050 | @dump Tenho trabalhando bastante com Spark, mas especificamente o |
|                     | PySpark. Vale a pena usar um …                                    |
| 1526336147856687105 | RT @MitchellvRijkom: PySpark Tip ⚡                                |
|                     |                                                                   |
|                     | When to use what StorageLevel for Cache / Persist?                |
|                     |                                                                   |
|                     | StorageLevel decides how and where data should be s…              |
| 1526322757880848385 | Solve challenges and exceed expectations with a career as a AWS   |
|                     | Pyspark Engineer.                                                 |
| 1526318637485010944 | RT @JosMiguelMoya1: #pyspark #spark #BigData curso completo de    |
|                     | Python y Spark con PySpark                                        |
| 1526318107228524545 | RT @money_personal: PySpark & AWS: Master Big Data With PySpark   |
|                     | and AWS                                                           |
|                     | #ApacheSpark #AWSDatabases #BigData #PySpark #100DaysofCode       |
|                     | -> http…                                                          |
+---------------------+-------------------------------------------------------------------+

The table being queried here, twitter_search_recent, receives the output of Twitter’s /2/tweets/search/recent endpoint and formulates it as a table with these columns. You don’t have to make an HTTP call to that API endpoint or unpack the results; you just write a SQL query that refers to the documented columns. One of those columns, query, is special: it encapsulates Twitter’s query syntax. Here we’re just looking for tweets that match PySpark, but we could as easily refine the query by pinning it to specific users, URLs, types (is:retweet, is:reply), properties (has:mentions, has:media), and so on. That query syntax is the same no matter how you access the API: from Python, from R, or from Steampipe. It’s plenty to think about, and it’s all you should really need to know when crafting queries to mine Twitter data.

2. GitHub: What are repositories that mention PySpark?

Here’s a SQL query to ask that question:

select
  name,
  owner_login,
  stargazers_count
from
  github_search_repository
where
  query = 'pyspark'
order by
  stargazers_count desc
limit 10;

Here’s the answer:

+----------------------+-------------------+------------------+
| name                 | owner_login       | stargazers_count |
+----------------------+-------------------+------------------+
| SynapseML            | microsoft         |             3297 |
| spark-nlp            | JohnSnowLabs      |             2725 |
| incubator-linkis     | apache            |             2524 |
| ibis                 | ibis-project      |             1805 |
| spark-py-notebooks   | jadianes          |             1455 |
| petastorm            | uber              |             1423 |
| awesome-spark        | awesome-spark     |             1314 |
| sparkit-learn        | lensacom          |             1124 |
| sparkmagic           | jupyter-incubator |             1121 |
| data-algorithms-book | mahmoudparsian    |             1001 |
+----------------------+-------------------+------------------+

This looks very similar to the first example! In this case, the table that’s being queried, github_search_repository, receives the output from GitHub’s /search/repositories endpoint and formulates it as a table with these columns.

In both cases the Steampipe documentation not only shows you the schemas that govern the mapped tables, it also gives examples (Twitter, GitHub) of SQL queries that use the tables in various ways.

Note that these are just two of many available tables. The Twitter API is mapped to 7 tables, and the GitHub API is mapped to 41 tables.

3. Twitter + GitHub: What have owners of PySpark-related repositories tweeted lately?

To answer this question we need to consult two different APIs, then join their results. That’s even harder to do, in a consistent way, when you’re reasoning over REST payloads in Python or R. But this is the kind of thing SQL was born to do. Here’s one way to ask the question in SQL.

-- find pyspark repos
with github_repos as (
  select
    name, owner_login, stargazers_count
  from
    github_search_repository
  where
    query = 'pyspark' and name ~ 'pyspark'
  order by stargazers_desc
  limit 50
),

-- find twitter handles of repo owners
github_users as (
  select
    u.login, u.twitter_username
  from
    github_user u
  join
    github_repos r on r.owner_login = u.login
  where
    u.twitter_username is not null
),

-- find corresponding twitter users
twitter_userids as (
  select
    id
  from
    twitter_user t
  join
    github_users g on t.username = g.twitter_username
)

-- find tweets from those users
select
  t.user->>'username' as twitter_user,
  'https://twitter.com/' || (t.user->>'username') || '/status/' || t.id as url,
  t.text
from
  twitter_user_tweet t
join
  twitter_userids u on t.user_id = u.id
where
  t.created_at > now()::date - interval '1 week'
order by
  t.created_at desc
limit 5

Here is the answer:

+----------------+-----+---------------------------------------+
| twitter_user   | url | text                                  |
+----------------+-----+---------------------------------------+
| idealoTech     |     | Are you able to find creative soluti… |
|                |     |                                       |
|                |     | Join our @codility Order #API Challe… |
|                |     |                                       |
|                |     | #idealolife #codility #php            |
| idealoTech     |     | Our #ProductDiscovery team at idealo… |
|                |     |                                       |
|                |     | Think you can solve it? 😎            |
|                |     | ➡ https://t…                          |
| ioannides_alex |     | RT @scikit_learn: scikit-learn 1.1 i… |
|                |     | What's new? You can check the releas… |
|                |     |                                       |
|                |     | pip install -U…                       |
| andfanilo      |     | @edelynn_belle Thanks! Sometimes it … |
| andfanilo      |     | @juliafmorgado Good luck on the reco… |
|                |     |                                       |
|                |     | My advice: power through it + a dead… |
|                |     |                                       |
|                |     | I hated my first few short videos bu… |
|                |     |                                       |
|                |     | Looking forward to the video 🙂       |
+----------------+-----+---------------------------------------+

When APIs frictionlessly become tables, you can devote your full attention to reasoning over the abstractions represented by those APIs. Larry Wall, the creator of Perl, famously said: “Easy things should be easy, hard things should be possible.” The first two examples are things that should be, and are, easy: each is just 10 lines of simple, straight-ahead SQL that requires no wizardry at all.

The third example is a harder thing. It would be hard in any programming language. But SQL makes it possible in several nice ways. The solution is made of concise stanzas (CTEs, Common Table Expressions) that form a pipeline. Each phase of the pipeline handles one clearly-defined piece of the problem. You can validate the output of each phase before proceeding to the next. And you can do all this with the most mature and widely-used grammar for selection, filtering, and recombination of data.
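To make the phase-by-phase idea concrete on something you can run anywhere, here is a minimal sketch using Python's built-in sqlite3 with invented toy tables standing in for the Steampipe ones (against Steampipe you would run the same kind of WITH query in Postgres; all table names and rows below are hypothetical):

```python
import sqlite3

# Toy stand-ins for the Steampipe tables, just to exercise the CTE pipeline.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table repos (name text, owner_login text, stars int);
    insert into repos values
        ('pyspark-examples', 'alice', 120),
        ('pyspark-notes',    'bob',    80),
        ('unrelated',        'carol', 999);
    create table users (login text, twitter_username text);
    insert into users values ('alice', 'alice_tw'), ('bob', null);
""")

# Phase 1: find the repos. Run and inspect this stanza on its own first.
phase1 = "select name, owner_login from repos where name like 'pyspark%'"
assert len(conn.execute(phase1).fetchall()) == 2

# Phase 2: only after phase 1 looks right, reuse it as a CTE and join
# in the owners' Twitter handles.
pipeline = f"""
    with pyspark_repos as ({phase1})
    select u.twitter_username
    from users u join pyspark_repos r on r.owner_login = u.login
    where u.twitter_username is not null
"""
print(conn.execute(pipeline).fetchall())  # -> [('alice_tw',)]
```

Each stanza is validated before the next one is layered on, which is exactly the workflow the three-table Steampipe query above supports.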

Do I have to use SQL?

No! If you like the idea of mapping APIs to tables, but you would rather reason over those tables in Python or R dataframes, then Steampipe can oblige. Under the covers it’s Postgres, enhanced with foreign data wrappers that handle the API-to-table transformation. Anything that can connect to Postgres can connect to Steampipe, including SQL drivers like Python’s psycopg2 and R’s RPostgres as well as business-intelligence tools like Metabase, Tableau, and PowerBI. So you can use Steampipe to frictionlessly consume APIs into dataframes, then reason over the data in Python or R.
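As a sketch of that workflow, here is how the first example's query might be read into a pandas dataframe. The connection details are assumptions based on Steampipe's documented local defaults; check the output of `steampipe service start` for the actual host, port, and password on your machine.

```python
# Assumed defaults for a locally running `steampipe service start`;
# verify the connection details it prints for your installation.
CONN = dict(host="localhost", port=9193, dbname="steampipe", user="steampipe")

QUERY = """
select id, text
from twitter_search_recent
where query = 'pyspark'
order by created_at desc
limit 5;
"""

def recent_pyspark_tweets():
    """Run the article's first query and return the rows as a pandas dataframe."""
    import pandas as pd
    import psycopg2  # assumes the psycopg2-binary package is installed
    with psycopg2.connect(**CONN) as conn:
        return pd.read_sql(QUERY, conn)
```

R users can do the equivalent with RPostgres: `dbConnect` with the same connection details, then `dbGetQuery` with the same SQL.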

But if you haven’t used SQL in this way before, it’s worth a look. Consider this comparison of SQL to Pandas, from “How to rewrite your SQL queries in Pandas.”

SQL:    select * from airports
Pandas: airports

SQL:    select * from airports limit 3
Pandas: airports.head(3)

SQL:    select id from airports where ident = 'KLAX'
Pandas: airports[airports.ident == 'KLAX'].id

SQL:    select distinct type from airports
Pandas: airports.type.unique()

SQL:    select * from airports where iso_region = 'US-CA' and type = 'seaplane_base'
Pandas: airports[(airports.iso_region == 'US-CA') & (airports.type == 'seaplane_base')]

SQL:    select ident, name, municipality from airports where iso_region = 'US-CA' and type = 'large_airport'
Pandas: airports[(airports.iso_region == 'US-CA') & (airports.type == 'large_airport')][['ident', 'name', 'municipality']]
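These pairs are easy to sanity-check. Here is a runnable sketch with a tiny airports dataframe; the column names follow the examples above, but the rows are invented:

```python
import pandas as pd

# A few made-up rows are enough to exercise the SQL/Pandas pairs.
airports = pd.DataFrame({
    "id":           [1, 2, 3],
    "ident":        ["KLAX", "KSFO", "C01"],
    "type":         ["large_airport", "large_airport", "seaplane_base"],
    "name":         ["Los Angeles Intl", "San Francisco Intl", "Lake Lawn"],
    "municipality": ["Los Angeles", "San Francisco", "Delavan"],
    "iso_region":   ["US-CA", "US-CA", "US-WI"],
})

# select id from airports where ident = 'KLAX'
klax_id = airports[airports.ident == "KLAX"].id

# select distinct type from airports
types = airports.type.unique()

# select ident, name, municipality from airports
#   where iso_region = 'US-CA' and type = 'large_airport'
ca_large = airports[
    (airports.iso_region == "US-CA") & (airports.type == "large_airport")
][["ident", "name", "municipality"]]

print(list(ca_large.ident))  # -> ['KLAX', 'KSFO']
```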

We can argue the merits of one style versus the other, but there’s no question that SQL is the most universal and widely-implemented way to express these operations on data. So no, you don’t have to use SQL to its fullest potential in order to benefit from Steampipe. But you might find that you want to.

Categories: Technology

Artificial Creativity?

O'Reilly Radar - Tue, 2022/07/12 - 06:24

There’s a puzzling disconnect in the many articles I read about DALL-E 2, Imagen, and the other increasingly powerful tools for generating images from textual descriptions. It’s common to read articles that talk about AI having creativity, but I don’t think that’s the case at all. As with the discussion of sentience, authors are being misled by a very human will to believe. And in being misled, they’re missing out on what’s important.

It’s impressive to see AI-generated pictures of an astronaut riding a horse, or a dog riding a bike in Times Square. But where’s the creativity? Is it in the prompt or in the product? I couldn’t draw a picture of a dog riding a bike; I’m not that good an artist. Given a few pictures of dogs, Times Square, and whatnot, I could probably photoshop my way into something passable, but not very good. (To be clear: these AI systems are not automating Photoshop.) So the AI is doing something that many, perhaps most, humans wouldn’t be able to do. That’s important. Very few humans (if any) can play Go at the level of AlphaGo. We’re getting used to being second-best.

However, a computer replacing a human’s limited photoshop skills isn’t creativity. It took a human to say “create a picture of a dog riding a bike.” An AI couldn’t do that of its own volition. That’s creativity. But before writing off the creation of the picture, let’s think more about what that really means. Works of art really have two sources: the idea itself and the technique required to instantiate that idea. You can have all the ideas you want, but if you can’t paint like Rembrandt, you’ll never generate a Dutch master. Throughout history, painters have learned technique by copying the works of masters. What’s interesting about DALL-E, Imagen, and their relatives is that they supply the technique. Using DALL-E or Imagen, I could create a painting of a tarsier eating an anaconda without knowing how to paint.

That distinction strikes me as very important. In the 20th and 21st centuries we’ve become very impatient with technique. We haven’t become impatient with creating good ideas. (Or at least strange ideas.) The “age of mechanical reproduction” seems to have made technique less relevant; after all, we’re heirs of the poet Ezra Pound, who famously said, “Make it new.”

But does that quote mean what we think? Pound’s “Make it new” has been traced back to 18th century China, and from there to the 12th century, something that’s not at all surprising if you’re familiar with Pound’s fascination with Chinese literature. What’s interesting, though, is that Chinese art has always focused on technique to a level that’s almost inconceivable to the European tradition. And “Make it new” has, within it, the acknowledgment that what’s new first has to be made. Creativity and technique don’t come apart that easily.

We can see that in other art forms. Beethoven broke Classical music and put it back together again, but different: he’s the most radical composer in the Western tradition (except, perhaps, for Thelonious Monk). And it’s worth asking how we get from what’s old to what’s new. AI has been used to complete Beethoven’s 10th symphony, for which Beethoven left a number of sketches and notes at the time of his death. The result is pretty good, better than the human attempts I’ve heard at completing the 10th. It sounds Beethoven-like; its flaw is that it goes on and on, repeating Beethoven-like riffs but without the tremendous forward-moving force of Beethoven’s compositions. But completing the 10th isn’t the problem we should be looking at. How did we get Beethoven in the first place? If you trained an AI on the music Beethoven was trained on, would you eventually get the 9th symphony? Or would you get something that sounds a lot like Mozart and Haydn?

I’m betting the latter. The progress of art isn’t unlike the structure of scientific revolutions, and Beethoven indeed took everything that was known, broke it apart, and put it back together differently. Listen to the opening of Beethoven’s 9th symphony: what is happening? Where’s the theme? It sounds like the orchestra is tuning up. When the first theme finally arrives, it’s not the traditional “melody” that pre-Beethoven listeners would have expected, but something that dissolves back into the sound of instruments tuning, then gets reformed and reshaped. Mozart would never do this. Or listen again to Beethoven’s 5th symphony, probably the most familiar piece of orchestral music in the world. That opening duh-duh-duh-DAH: what kind of theme is that? Beethoven builds this movement by taking that four-note fragment, moving it around, changing it, breaking it into even smaller bits and reassembling them. You can’t imagine a witty, urbane, polite composer like Haydn writing music like this.

But I don’t want to worship some notion of Beethoven’s “genius” that privileges creativity over technique. Beethoven could never have gotten beyond Mozart and Haydn (with whom he studied) without extensive knowledge of the technique of composing; he would have had some good ideas, but he would never have known how to realize them. Conversely, the realization of radical ideas as actual works of art inevitably changes the technique. Beethoven did things that weren’t conceivable to Mozart or Haydn, and they changed the way music was written: those changes made the music of Schubert, Schumann, and Brahms possible, along with the rest of the 19th century.

That brings us back to the question of computers, creativity, and craft. Systems like DALL-E and Imagen break apart the idea and the technique, or the execution of the idea. Does that help us be more creative, or less? I could tell Imagen to “paint a picture of a 15th century woman with an enigmatic smile,” and after a few thousand tries I might get something like the Mona Lisa. I don’t think that anyone would care, really.  But this isn’t creating something new; it’s reproducing something old. If I magically appeared early in the 20th century, along with a computer capable of running Imagen (though only trained on art through 1900), would I be able to tell it to create a Picasso or a Dali? I have no idea how to do that. Nor do I have any idea what the next step for art is now, in the 21st century, or how I’d ask Imagen to create it. It sure isn’t Bored Apes. And if I could ask Imagen or DALL-E to create a painting from the 22nd century, how would that change the AI’s conception of technique?

At least part of what I lack is the technique, for technique isn’t just mechanical ability; it’s also the ability to think the way great artists do. And that gets us to the big question:

Now that we have abstracted technique away from the artistic process, can we build interfaces between the creators of ideas and the machines of technique in a way that allows the creators to “make it new”?  That’s what we really want from creativity: something that didn’t exist, and couldn’t have existed, before.

Can artificial intelligence help us to be creative? That’s the important question, and it’s a question about user interfaces, not about who has the biggest model.


Radar Trends to Watch: July 2022

O'Reilly Radar - Tue, 2022/07/05 - 04:09

This month, large models are even more in the news than last month: the open source Bloom model is almost finished, Google’s LaMDA is good enough that it can trick people into thinking it’s sentient, and DALL-E has gotten even better at drawing what you ask.

The most important issue facing technology might now be the protection of privacy. While that’s not a new concern, it’s a concern that most computer users have been willing to ignore, and that most technology companies have been willing to let them ignore. New state laws that criminalize having abortions out of state and the stockpiling of location information by antiabortion groups have made privacy an issue that can’t be ignored.

Artificial Intelligence
  • BigScience has almost finished training its open source BLOOM language model, which was developed by volunteer researchers and trained using public funds. BLOOM will provide an open, public platform for research into the capabilities of large language models and, specifically, into issues like avoiding bias and toxic language.
  • AI tools like AlphaFold2 can create new proteins, not just analyze existing ones; the unexpected creation of new artifacts by an AI system is playfully called “hallucination.” The proteins designed so far probably aren’t useful; still, this is a major step forward in drug design.
  • Microsoft is limiting or removing access to some features in its face recognition service, Azure Face. Organizations will have to tell Microsoft how and why facial recognition will be used in their systems; and services like emotion recognition will be removed completely.
  • Amazon plans to give Alexa the ability to imitate anyone’s voice, using under a minute of audio. They give the example of a (possibly dead) grandmother “reading” a book to a child. Other AI vendors (most notably OpenAI/Microsoft) have considered such mimicry unethical.
  • Dolt is a SQL database that lets you version data using git commands. You can clone, push, pull, fork, branch, and merge just as with git; you access data using standard SQL.
  • It’s sadly unsurprising that a robot incorporating a widely-used neural network (OpenAI CLIP) learns racist and sexist biases, and that these biases affect its performance on tasks.
  • Building autonomous vehicles with memory, so that they can learn about objects on the routes they drive, may be an important step in making AVs practical. In real life, most people drive over routes they are already familiar with. Autonomous vehicles should have the same advantage.
  • The argument about whether Google’s LaMDA is “sentient” continues, with a Google engineer placed on administrative leave for publishing transcripts of conversations that he claimed demonstrate sentience. Or are large language models just squirrels?
  • For artists working in collaboration with AI, the possibilities and imperfections of AI are a means of extending their creativity.
  • Pete Warden’s proposal for ML Sensors could make developing embedded ML systems much simpler: push the machine learning into the sensors themselves.
  • Researchers using DALL-E 2 discovered that the model has a “secret vocabulary” that’s not human language, but that can be used somewhat reliably to create consistent pictures. It may be an artifact of the model’s inability to say “I didn’t understand that”; given nonsense input, it is pulled towards similar words in the training corpus.
  • HuggingFace has made an agreement with Microsoft that will allow Azure customers to run HuggingFace language models on the Azure platform.
  • The startup Predibase has built a declarative low-code platform for building AI systems. In a declarative system, you describe the outcome you want, rather than the process for creating the outcome. The system figures out the process.
  • Researchers are developing AI models that implement metamemory: the ability to remember whether or not you know something.
  • As the population ages, it will be more important to diagnose diseases like Alzheimer’s early, when treatment is still meaningful. AI is providing tools to help doctors analyze MRI images more accurately than humans. These tools don’t attempt diagnosis; they provide data about brain features.
  • Google has banned the training of Deepfakes on Colab, its free Jupyter-based cloud programming platform.
  • Samsung and Red Hat are working on new memory architectures and device drivers that will be adequate to the demands of a 3D-enabled, cloud-based metaverse.
  • The Metaverse Standards Forum is a new industry group with the goal of solving interoperability problems for the Metaverse. It views the Metaverse as the outgrowth of the Web, and plans to coordinate work between existing standards groups (like the W3C) relevant to the Metaverse.
  • Can the “Open Metaverse” be the future of the Internet?  The Open Metaverse Interoperability Group is building vendor-independent standards for social graphs, identities, and other elements of a Metaverse.
  • Holographic heads-up displays allow for 3D augmented reality: the ability to project 3D images onto the real world (for example, onto a car’s windshield).
  • Google’s Visual Position Service uses the data they’ve collected through Street View to provide high-accuracy positioning data for augmented reality applications. (This may be related to Niantic’s VPS, or they may just be using the same acronym.)
Programming
  • Amazon has launched CodeWhisperer, a direct competitor to GitHub Copilot.
  • Linus Torvalds predicts that Rust will be used in the Linux kernel by 2023.
  • GitHub Copilot is now generally available (for a price); it’s free to students and open source maintainers. Corporate licenses will be available later this year.
  • WebAssembly is making inroads. The universal WebAssembly runtime, Wasmer, runs any code on any platform. Impressive, if it delivers.
  • Can WebAssembly replace Docker? Maybe, in some applications. WASM provides portability and eliminates some security issues (possibly introducing its own); Docker sets up environments.
  • Mozilla’s Project Bergamot is an automated translation tool designed for use on the Web. It can be used to build multilingual forms and other web pages. Unlike most other AI technologies, Bergamot runs in the browser using WASM. No data is sent to the cloud.
  • Microsoft has released a framework called Fluid for building collaborative apps, such as Slack, Discord, and Teams. Microsoft will also be releasing Azure Fluid Relay to support Fluid-based applications.
  • Dragonfly is a new in-memory database that claims significantly faster performance than memcached and Redis.
  • The Chinese government has blocked access to open source code on Gitee, the Chinese equivalent to GitHub, saying that all code must be reviewed by the government before it can be released to the public.
  • Is Blockchain Decentralized? A study commissioned by DARPA investigates whether a blockchain is truly immutable, or whether it can be modified without exploiting cryptographic vulnerabilities, but by attacking the blockchain’s implementation, networking, and consensus protocols. This is the most comprehensive examination of blockchain security that we’ve seen.
  • Jack Dorsey has announced that he’s working on Web5, which will be focused on identity management and be based on Bitcoin.
  • Molly White’s post questioning the possibility of acceptably non-dystopian self-sovereign identity is a must-read; she has an excellent summary and critique of just about all the work going on in the field.
  • Cryptographer Matthew Green makes an important argument for the technologies behind cryptocurrency (though not for the current implementations).
Quantum Computing
  • Probabilistic computers, built from probabilistic bits (p-bits), may provide a significant step forward for probabilistic decision making. This sounds esoteric, but it’s essentially what we’re asking AI systems to do. P-bits may also be able to simulate qubits and quantum computing.
  • A system that links two time crystals could be the basis for a new form of quantum computing. Time crystals can exist at room temperature, and remain coherent for much longer than existing qubit technologies.

2022 Cloud Salary Survey

O'Reilly Radar - Wed, 2022/06/22 - 04:21

Last year, our report on cloud adoption concluded that adoption was proceeding rapidly; almost all organizations are using cloud services. Those findings confirmed the results we got in 2020: everything was “up and to the right.” That’s probably still true—but saying “everything is still up and to the right” would be neither interesting nor informative. So rather than confirming the same results for a third year, we decided to do something different.

This year’s survey asked questions about compensation for “cloud professionals”: the software developers, operations staff, and others who build cloud-based applications, manage a cloud platform, and use cloud services. We limited the survey to residents of the United States because salaries from different countries aren’t directly comparable; in addition to fluctuating exchange rates, there are different norms for appropriate compensation. This survey ran from April 4 through April 15, 2022, and was publicized via email to recipients of our Infrastructure & Ops Newsletter whom we could identify as residing in the United States or whose location was unknown.

Executive Summary
  • Survey respondents earn an average salary of $182,000.
  • The average salary increase over the past year was 4.3%.
  • 20% of respondents reported changing employers in the past year.
  • 25% of respondents are planning to change employers because of compensation.
  • The average salary for women is 7% lower than the average salary for men.
  • 63% of respondents work remotely all the time; 94% work remotely at least one day a week.
  • Respondents who participated in 40 or more hours of training in the past year received higher salary increases.

Of the 1,408 responses we initially received, 468 were disqualified. Respondents were disqualified (and the survey terminated) if the respondent said they weren’t a US resident or if they were under 18 years old; respondents were also disqualified if they said they weren’t involved with their organization’s use of cloud services. Another 162 respondents filled out part of the survey but didn’t complete it; we chose to include only complete responses. That left us with 778 responses. Participants came from 43 states plus Washington, DC. As with our other surveys, the respondents were a relatively senior group: the average age was 47 years old, and while the largest number identified themselves as programmers (43%), 14% identified as executives and 33% as architects.

The Big Picture

Cloud professionals are well paid. That’s not a surprise in itself. We expected salaries (including bonuses) to be high, and they were. The cloud professionals who responded to our survey earn an average salary of $182,000; the most common salary range among respondents was $150,000 to $175,000 per year (16% of the total), as shown in Figure 1. The peak was fairly broad: 68% of the respondents earn between $100,000 and $225,000 per year. And there was a significant “long tail” in the compensation stratosphere: 7% of the respondents earn over $300,000 per year, and 2.4% over $400,000 per year.

Figure 1. Annual salary by percentage of respondents

We believe that job changes are part of what’s driving high salaries. After all, we’ve heard about talent shortages in almost every field, with many employers offering very high salaries to attract the staff they need. By staying with their current employer, an employee may get an annual salary increase of 4%. But if they change jobs, they might get a significantly higher offer—20% or more—plus a signing bonus.

20% of the respondents reported that they changed employers in the past year. That number isn’t high in and of itself, but it looks a lot higher when you add it to the 25% who are planning to leave jobs over compensation. (Another 20% of the respondents declined to answer this question.) It’s also indicative that 19% of the respondents received promotions. There was some overlap between those who received promotions and those who changed jobs (5% of the total said “yes” to both questions, or roughly one quarter of those who changed jobs). When you look at the number of respondents who left their employer, are planning to leave their employer, or got a promotion and a salary increase, it’s easy to see why salary budgets are under pressure. Right now, qualified candidates have the power in the job market, though with the stock market correction that began in March 2022 and significant layoffs from some large technology-sector companies, that may be changing.

These conclusions are borne out when you look at the salaries of those who were promoted, changed jobs, or intend to change jobs. A promotion roughly doubled respondents’ year-over-year salary increase: on average, those who were promoted received a 7% raise, while those who weren’t received a 3.7% increase. The result was almost exactly the same for those who changed jobs: those who changed averaged a 6.8% salary increase, while those who remained averaged 3.7%. We also see a difference in the salaries of those who intend to leave because of compensation: their average salary is $171,000, as opposed to $188,000 for those who didn’t plan to leave. That’s a $17,000 difference, or roughly 10%.

Salaries by Gender

One goal of this survey was to determine whether women are being paid fairly. Last year’s salary survey for data and AI found a substantial difference between men’s and women’s salaries: women were paid 16% less than men. Would we see the same here?

The quick answer is “yes,” but the difference was smaller. Average salaries for women are 7% lower than for men ($172,000 as opposed to $185,000). But let’s take a step back before looking at salaries in more detail. We asked our respondents what pronouns they use. Only 8.5% said “she,” while 79% chose “he.” That’s still only 87% of the total. Where are the rest? 12% preferred not to say; this is a larger group than those who used “she.” 0.5% chose “other,” and 0.7% chose “they.” (That’s only four and six respondents, respectively.) Compared to results from our survey on the data/AI industry, the percentage of cloud professionals who self-identified as women appears to be much smaller (8.5%, as opposed to 14%). But there’s an important difference between the surveys: “I prefer not to answer” wasn’t an option for the Data/AI Salary Survey. We can’t do much with those responses. When we eyeballed the data for the “prefer not to say” group, we saw somewhat higher salaries than for women, but still significantly less (5% lower) than for men.

The difference between men’s and women’s salaries is smaller than we expected, given the results of last year’s Data/AI Salary Survey. But it’s still a real difference, and it raises the question: Is compensation improving for women? Talent shortages are driving compensation up in many segments of the software industry. Furthermore, the average reported salaries for both men and women in our survey are high. Again, is that a consequence of the talent shortage? Or is it an artifact of our sample, which appears to be somewhat older, and rich in executives? We can’t tell from a single year’s data, and the year-over-year comparison we made above is based on a different industry segment. But the evidence suggests that the salary gap is closing, and progress is being made. And that is indeed a good thing.

Salaries for respondents who answered “other” to the question about the pronouns they use are 31% lower than salaries for respondents who chose “he.” Likewise, salaries for respondents who chose “they” are 28% lower than men’s average salaries. However, both of these groups are extremely small, and in both groups, one or two individuals pulled the averages down. We could make the average salaries higher by calling these individuals “outliers” and removing their data; after all, outliers can have outsized effects on small groups. That’s a step we won’t take. Whatever the reason, the outliers are there; they’re part of the data. Professionals all across the spectrum have low-paying jobs—sometimes by choice, sometimes out of necessity. Why does there appear to be a concentration of them among people who don’t use “he” or “she” as their pronouns? The effect probably isn’t quite as strong as our data indicates, but we won’t try to explain our data away. It’s telling that the groups that use “they” or a pronoun other than “he” or “she” showed a salary penalty. We have to conclude that respondents who use nonbinary pronouns earn lower salaries, but without more data, we don’t know why, how much lower their salaries really are, or whether this difference would disappear with a larger sample.

To see more about the differences between men’s and women’s salaries, we looked at the men and women in each salary range. The overall shapes of the salary distributions are clear: a larger percentage of women earn salaries between $0 and $175,000, and (with two exceptions) a larger percentage of men earn salaries over $175,000. However, a slightly larger percentage of women earn supersize salaries ($400,000 or more), and a significantly larger percentage earn salaries between $225,000 and $250,000 (Figure 2).

Figure 2. Men’s and women’s salaries by percentage of respondents

We can get some additional information by looking at salary increases (Figure 3). On average, women’s salary increases were higher than men’s: $9,100 versus $8,100. That doesn’t look like a big difference, but it’s over 10%. We can read that as a sign that women’s salaries are catching up. But the signals are mixed. Men’s salaries increased more than women’s in almost every segment, with two big exceptions: 12% of women received salary increases over $30,000, while only 8% of men did. Likewise, 17% of women received increases between $10,000 and $15,000, but only 9% of men did. These differences might well disappear with more data.
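The “over 10%” claim checks out; as a quick sketch, using the average raises quoted above:

```python
# Women's vs. men's average salary increase, from the survey figures above.
men_raise, women_raise = 8_100, 9_100

# Relative difference: how much larger women's average raise is than men's.
rel_diff_pct = 100 * (women_raise - men_raise) / men_raise
print(round(rel_diff_pct, 1))  # 12.3
```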

Figure 3. Salary increases for women and men by percentage of respondents

When we look at salary increases as a percentage of salary, we again see mixed results (Figure 4). Women’s salary increases were much larger than men’s in three bands: over $325,000 (with the exception of $375,000–$400,000, where there were no women respondents), $275,000–$300,000, and $150,000–$175,000. For those with very large salaries, women’s salary increases were much higher than men’s. Furthermore, the $150,000–$175,000 band had the largest number of women. While there was a lot of variability, salary increases are clearly an important factor driving women’s salaries toward parity with men’s.

Figure 4. Salary increases as a percentage of salary

The Effect of Education

The difference between men’s and women’s salaries is significant at almost every educational level (Figure 5). The difference is particularly high for respondents who are self-taught, where women earned 39% less ($112,000 versus $184,000), and for students (45% less, $87,000 versus $158,000). However, those were relatively small groups, with only two women in each group. It’s more important that for respondents with bachelor’s degrees, women’s salaries were 4% higher than men’s ($184,000 versus $176,000)—and this was the largest group in our survey. For respondents with advanced degrees, women with doctorates averaged a 15% lower salary than men with equivalent education; women with master’s degrees averaged 10% lower. The difference between women’s and men’s salaries appears to be greatest at the extremes of the educational spectrum.

Figure 5. Men’s and women’s salaries by degree

Salaries by State

Participants in the survey come from 43 states plus Washington, DC. Looking at salaries by state creates some interesting puzzles. The highest salaries are found in Oklahoma; South Dakota is third, following California. And the top of the list is an interesting mix of states where we expected high salaries (like New York) and states where we expected salaries to be lower. So what’s happening?

The average salary from Oklahoma is $225,000—but that only reflects two respondents, both of whom work remotely 100% of the time. (We’ll discuss remote work later in this report.) Do they work for a Silicon Valley company and get a Silicon Valley salary? We don’t know, but that’s certainly a possibility. The average salary for South Dakota is $212,000, but we shouldn’t call it an “average,” because we only had one response, and this respondent reported working remotely 1–4 days per week. Likewise, Vermont had a single respondent, who works remotely and who also had an above-average salary. Many other states have high average salaries but a very small number of respondents.

So the first conclusion that we can draw is that remote work might be making it possible for people in states without big technology industries to get high salaries. Or it could be the opposite: there’s no state without some businesses using the cloud, and the possibility of remote work puts employers in those states in direct competition with Silicon Valley salaries: they need to pay much higher salaries to get the expertise they need. And those job offers may include the opportunity to work remotely full or part time—even if the employer is local. Both of those possibilities no doubt hold true for individuals, if not for geographical regions as a whole.

Outliers aside, salaries are highest in California ($214,000), New York ($212,000), Washington ($203,000), Virginia ($195,000), and Illinois ($191,000). Massachusetts comes next at $189,000. At $183,000, average salaries in Texas are lower than we’d expect, but they’re still slightly above the national average ($182,000). States with high average salaries tended to have the largest numbers of respondents—with the important exceptions that we’ve already noted. The lowest salaries are found in West Virginia ($87,000) and New Mexico ($84,000), but these reflected a small number of respondents (one and four, respectively). These two states aside, the average salary in every state was over $120,000 (Figure 6).

So, is remote work equalizing salaries between different geographical regions? It’s still too early to say. We don’t think there will be a mass exodus from high-salary states to more rural states, but it’s clear that professionals who want to make that transition can, and that companies that aren’t in high-salary regions will need to offer salaries that compete in the nationwide market. Future surveys will tell us whether this pattern holds true.

Figure 6. Average salary by state

Salaries by Age

The largest group of respondents to our survey were between 45 and 54 years old (Figure 7). This group also had the highest average salary ($196,000). Salaries for respondents between 55 and 64 years old were lower (averaging $173,000), and salaries dropped even more for respondents over 65 ($139,000). Salaries for the 18- to 24-year-old age range were low, averaging $87,000. These lower salaries are no surprise because this group includes both students and those starting their first jobs after college.

It’s worth noting that our respondents were older than we expected; 29% were between 35 and 44 years old, 36% were between 45 and 54, and 22% were between 55 and 64. Data from our learning platform shows that this distribution isn’t indicative of the field as a whole, or of our audience. It may be an artifact of the survey itself. Are our newsletter readers older, or are older people more likely to respond to surveys? We don’t know.

Figure 7. Average salary by age

The drop in salaries after age 55 is surprising. Does seniority count for little? It’s easy to make hypotheses: Senior employees are less likely to change jobs, and we’ve seen that changing jobs drives higher salaries. But it’s also worth noting that AWS launched in 2002, roughly 20 years ago. People who are now 45 to 54 years old started their careers in the first years of Amazon’s rollout. They “grew up” with the cloud; they’re the real cloud natives, and that appears to be worth something in today’s market.

Job Titles and Roles

Job titles are problematic. There’s no standardized naming system, so a programming lead at one company might be an architect or even a CTO at another. So we ask about job titles at a fairly high level of abstraction. We offered respondents a choice of four “general” roles: executive, director, manager, or associate. We also allowed respondents to write in their own job titles; roughly half chose this option. The write-in titles were more descriptive and, as expected, inconsistent. We were able to group them into some significant clusters by looking for people whose write-in title used the words “engineer,” “programmer,” “developer,” “architect,” “consultant,” or “DevOps.” We also looked at two modifiers: “senior” and “lead.” There’s certainly room for overlap: someone could be a “senior DevOps engineer.” But in practice, overlap was small. (For example, no respondents used both “developer” and “architect” in a write-in job title.) There was no overlap between the titles submitted by respondents and the general titles we offered on the survey: our respondents had to choose one or the other.

So what did we see? As shown in Figure 8, the highest salaries go to those who classified themselves as directors ($235,000) or executives ($231,000). Salaries for architects, “leads,” and managers are on the next tier ($196,000, $190,000, and $188,000, respectively). People who identified as engineers earn slightly lower salaries ($175,000). Associates, a relatively junior category, earn an average of $140,000 per year. Those who used “programmer” in their job title are a puzzle. There were only three of them, which is a surprise in itself, and all have salaries in the $50,000 to $100,000 range (average $86,000). Consultants also did somewhat poorly, with an average salary of $129,000.

Those who identified as engineers (19%) made up the largest group of respondents, followed by associates (18%). Directors and managers each comprised 15% of the respondents. That might be a bias in our survey, since it’s difficult to believe that 30% of cloud professionals have directorial or managerial roles. (That fits the observation that our survey results may skew toward older participants.) Architects were less common (7%). And relatively few respondents identified themselves with the terms “DevOps” (2%), “consultant” (2%), or “developer” (2%). The small number of people who identify with DevOps is another puzzle. It’s often been claimed that the cloud makes operations teams unnecessary; “NoOps” shows up in discussions from time to time. But we’ve never believed that. Cloud deployments still have a significant operational component. While the cloud may allow a smaller group to oversee a huge number of virtual machines, managing those machines has become more complex—particularly with cloud orchestration tools like Kubernetes.

Figure 8. Average salary by job title

We also tried to understand what respondents are doing at work by asking about job roles, decoupling responsibilities from titles (Figure 9). So in another question, we asked respondents to choose between marketing, sales, product, executive, programmer, and architect roles, with no write-in option. Executives earn the highest salaries ($237,000) but were a relatively small group (14%). Architects are paid $188,000 per year on average; they were 33% of respondents. And for this question, respondents didn’t hesitate to identify as programmers: this group was the largest (43%), with salaries somewhat lower than architects ($163,000). This is roughly in agreement with the data we got from job titles. (And we should have asked about operations staff. Next year, perhaps.)

The remaining three groups—marketing, sales, and product—are relatively small. Only five respondents identified their role as marketing (0.6%), but they were paid well ($187,000). 1.5% of the respondents identified as sales, with an average salary of $186,000. And 8% of the respondents identified themselves with product, with a somewhat lower average salary of $162,000.
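These sub-2% shares translate into only a handful of people. As a sketch (778 respondents total; the sales headcount is our estimate, not a figure reported by the survey):

```python
# Convert a share of the 778 respondents back into an approximate head count.
TOTAL_RESPONDENTS = 778

def headcount(pct: float) -> int:
    return round(TOTAL_RESPONDENTS * pct / 100)

print(headcount(0.6))  # 5  (marketing, matching the five reported above)
print(headcount(1.5))  # ~12 (sales, estimated)
```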

Figure 9. Average salary by role

Working from Home

When we were planning this survey, we were very curious about where people worked. Many companies have moved to a fully remote work model (as O’Reilly has), and many more are taking a hybrid approach. But just how common is remote work? And what consequences does it have for the employees who work from home rather than in an office?

It turns out that remote work is surprisingly widespread (Figure 10). We found that only 6% of respondents answered no to the question “Do you work remotely?” More than half (63%) said that they work remotely all the time, and the remainder (31%) work remotely 1–4 days per week.

Working remotely is also associated with higher salaries: the average salary for people who work remotely 1–4 days a week is $188,000. It’s only slightly less ($184,000) for people who work remotely all the time. Salaries are sharply lower for people who never work remotely (average $131,000).

Figure 10. Salaries and remote work

Salary increases show roughly the same pattern (Figure 11). While salaries are slightly higher for respondents who occasionally work in the office, salary increases were higher for those who are completely remote: the average increase was $8,400 for those who are remote 100% of the time, while those who work from home 1–4 days per week only averaged a $7,800 salary increase. We suspect that given time, these two groups would balance out. Salary changes for those who never work remotely were sharply lower ($4,500).

Of all jobs in the computing industry, cloud computing is probably the most amenable to remote work. After all, you’re working with systems that are remote by definition. You’re not reliant on your own company’s data center. If the application crashes in the middle of the night, nobody will be rushing to the machine room to reboot the server. A laptop and a network connection are all you need.

Figure 11. Salary increases and remote work

We’re puzzled by the relatively low salaries and salary increases for those who never work remotely. While there were minor differences, as you’d expect, there were no “smoking guns”: no substantial differences in education or job titles or roles. Does this difference reflect old-school companies that don’t trust their staff to be productive at home? And do they pay correspondingly lower salaries? If so, they’d better be forewarned: it’s very easy for employees to change jobs in the current labor market.

As the pandemic wanes (if indeed it wanes—despite what people think, that’s not what the data shows), will companies stick with remote work or will they require employees to come back to the office? Some companies have already asked their employees to return. But we believe that the trend toward remote work will be hard, if not impossible, to reverse, especially in a job market where employers are competing for talent. Remote work certainly raises issues about onboarding new hires, training, group dynamics, and more. And it’s not without problems for the employees themselves: childcare, creating appropriate work spaces, etc. These challenges notwithstanding, it’s difficult to imagine people who have eliminated a lengthy commute from their lives going back to the office on a permanent basis.

Certifications and Training

Nearly half (48%) of our respondents participated in technical training or certification programs in the last year. 18% of all respondents obtained one or more certifications, suggesting that the remaining 30% participated in training or some other form of professional development that wasn’t tied to a certification program.
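The 30% figure follows by simple subtraction; a trivial sketch, assuming the 18% is measured against all respondents rather than only those who trained:

```python
# Shares of all survey respondents (778 total), as reported above.
trained_pct = 48    # took part in training or certification programs
certified_pct = 18  # obtained one or more certifications

# Respondents who trained without earning a certification
training_only_pct = trained_pct - certified_pct
print(training_only_pct)  # 30
```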

The most common reasons for participating in training were learning new technologies (42%) and improving existing skills (40%). (Percentages are relative to the total number of respondents, which was 778.) 21% wanted to work on more interesting projects. The other possible responses were chosen less frequently: 9% of respondents wanted to move into a leadership role, and 12% were required to take training. Job security was an issue for 4% of the respondents, a very small minority. That’s consistent with our observation that employees have the upper hand in the labor market and are more concerned with advancement than with protecting their status quo.

Survey participants obtained a very broad range of certifications. We asked specifically about 11 cloud certifications that we identified as being particularly important. Most were specific to one of the three major cloud vendors: Microsoft Azure, Amazon Web Services, and Google Cloud. However, the number of people who obtained any specific certification was relatively small. The most popular certifications were AWS Certified Cloud Practitioner and Solutions Architect (both 4% of the total number of respondents). That said, 8% of respondents answered “other” and provided a write-in answer. That’s 60 respondents—and we got 55 different write-ins. Obviously, there was very little duplication. The only submissions with multiple responses were CKA (Certified Kubernetes Administrator) and CKAD (Certified Kubernetes Application Developer). The range of training in this “other” group was extremely broad, spanning various forms of Agile training, security, machine learning, and beyond. Respondents were pursuing many vendor-specific certifications, and even academic degrees. (It’s worth noting that our 2021 Data/AI Salary Survey report also concluded that earning a certification for one of the major cloud providers was a useful tool for career advancement.)

Given the number of certifications that are available, this isn’t surprising. It’s somewhat more surprising that there isn’t any consensus on which certifications are most important. When we look at salaries, though, we see some signals…at least among the leading certifications. The largest salaries are associated with Google Cloud Certified Professional Cloud Architect ($231,000). People who earned this certification also received a substantial salary increase (7.1%). Those who obtained an AWS Certified Solutions Architect – Professional, AWS Certified Solutions Architect – Associate, or Microsoft Certified: Azure Solutions Architect Expert certification also earn very high salaries ($212,000, $201,000, and $202,000, respectively), although these three received smaller salary increases (4.6%, 4.4%, and 4.0%, respectively). Those who earned the CompTIA Cloud+ certification receive the lowest salary ($132,000) and got a relatively small salary increase (3.5%). The highest salary increase went to those who obtained the Google Cloud Certified Professional Cloud DevOps Engineer certification (9.7%), with salaries in the middle of the range ($175,000).

We can’t draw any conclusions about the salaries or salary increases corresponding to the many certifications listed among the “other” responses; most of those certifications only appeared once. But it seems clear that the largest salaries and salary increases go to those who are certified for one of the big three platforms: Google Cloud, AWS, and Microsoft Azure (Figures 12 and 13).

The salaries and salary increases for the two Google certifications are particularly impressive. Given that Google Cloud is the least widely used of the major platforms, and that the number of respondents for these certifications was relatively small, we suspect that talent proficient with Google’s tools and services is harder to find and drives the salaries up.

Figure 12. Average salary by certification

Figure 13. Average salary increase by certification

Our survey respondents engaged in many different types of training. The most popular were watching videos and webinars (41%), reading books (39%), and reading blogs and industry articles (34%). 30% of the respondents took classes online. Given the pandemic, it isn’t at all surprising that only 1.7% took classes in person. 23% attended conferences, either online or in person. (We suspect that the majority attended online.) And 24% participated in company-offered training.

There’s surprisingly little difference between the average salaries associated with each type of learning. That’s partly because respondents were allowed to choose more than one response. But it’s also notable that the average salaries for most types of learning are lower than the average salary for the respondents as a whole. The average salary by type of learning ranges from $167,000 (in-person classes) to $184,000 (company-provided educational programs). These salaries are on the low side compared to the overall average of $182,000. Lower salaries may indicate that training is most attractive to people who want to get ahead in their field. This fits the observation that most of the people who participated in training did so to obtain new skills or to improve current ones. After all, to many companies “the cloud” is still relatively new, and they need to retrain their current workforces.

When we look at the time that respondents spent in training (Figure 14), we see that the largest group spent 20–39 hours in the past year (13% of all the respondents). 12% spent 40–59 hours; and 10% spent over 100 hours. No respondents reported spending 10–19 hours in training. (There were also relatively few in the 80–99 hour group, but we suspect that’s an artifact of “bucketing”: if you’ve taken 83 hours of training, you’re likely to think, “I don’t know how much time I spent in training, but it was a lot,” and choose 100+.) The largest salary increases went to those who spent 40–59 hours in training, followed by those who spent over 100 hours; the smallest salary increases, and the lowest salaries, went to those who only spent 1–9 hours in training. Managers take training into account when planning compensation, and those who skimp on training shortchange themselves.

Figure 14. Percentage salary increase by time spent in training

The Cloud Providers

A survey of this type wouldn’t be complete without talking about the major cloud providers. There’s no really big news here (Figure 15). Amazon Web Services has the most users, at 72%, followed by Microsoft Azure (42%) and Google Cloud (31%). Compared to the cloud survey we did last year, it looks like Google Cloud and Azure have dropped slightly compared to AWS. But the changes aren’t large. Oracle’s cloud offering was surprisingly strong at 6%, and 4% of the respondents use IBM Cloud.

When we look at the biggest cloud providers that aren’t based in the US, we find that they’re still a relatively small component of cloud usage: 0.6% of respondents use Alibaba, while 0.3% use Tencent. Because there are so few users among our respondents, the percentages don’t mean much: a few more users, and we might see something completely different. That said, we expected to see more users working with Alibaba; it’s possible that tensions between the United States and China have made it a less attractive option.

20% of the respondents reported using a private cloud. While it’s not entirely clear what the term “private cloud” means—for some, it just means a traditional data center—almost all the private cloud users also reported using one of the major cloud providers. This isn’t surprising; private clouds make the most sense as part of a hybrid or multicloud strategy, where the private cloud holds data that must be kept on premises for security or compliance reasons.

6% of the respondents reported using a cloud provider that we didn’t list. These answers were almost entirely from minor cloud providers, which had only one or two users among the survey participants. And surprisingly, 4% of the respondents reported that they weren’t using any cloud provider.

Figure 15. Cloud provider usage by percentage of respondents

There’s little difference between the salaries reported by people using the major providers (Figure 16). Tencent stands out; the average salary for its users is $275,000. But there were so few Tencent users among the survey respondents that we don’t believe this average is meaningful. There appears to be a slight salary premium for users of Oracle ($206,000) and Google ($199,000); since these cloud providers aren’t as widely used, it’s easy to assume that organizations committed to them are willing to pay slightly more for specialized talent, a phenomenon we’ve observed elsewhere. Almost as a footnote, we see that the respondents who don’t use a cloud have significantly lower salaries ($142,000).

Figure 16. Average salary by cloud provider

Cloud providers offer many services, but their basic services fall into a few well-defined classes (Figure 17). 75% of the survey respondents reported using virtual instances (for example, AWS EC2), and 74% use bucket storage (for example, AWS S3). These are services that are offered by every cloud provider. Most respondents use an SQL database (59%). Somewhat smaller numbers reported using a NoSQL database (41%), often in conjunction with an SQL database. 49% use container orchestration services; 45% use “serverless,” which suggests that serverless is more popular than we’ve seen in our other recent surveys.

Only 11% reported using some kind of AutoML—again, a service that’s provided by all the major cloud providers, though under differing names. And again, we saw no significant differences in salary based on what services were in use. That makes perfect sense; you wouldn’t pay a carpenter more for using a hammer than for using a saw.

Figure 17. Basic cloud services usage by percentage of respondents

The Work Environment

Salaries aside, what are cloud developers working with? What programming languages and tools are they using?

Python is the most widely used language (59% of respondents), followed by SQL (49%), JavaScript (45%), and Java (32%). It’s somewhat surprising that only a third of the respondents use Java, given that programming language surveys done by TIOBE and RedMonk almost always have Java, Python, and JavaScript in a near tie for first place. Java appears not to have adapted well to the cloud (Figure 18).

Salaries also follow a pattern that we’ve seen before. Although the top four languages are in high demand, they don’t command particularly high salaries: $187,000 for Python, $179,000 for SQL, $181,000 for JavaScript, and $188,000 for Java (Figure 19). These are all “table stakes” languages: they’re necessary and they’re what most programmers use on the job, but the programmers who use them don’t stand out. And despite the necessity, there’s a lot of talent available to fill these roles. As we saw in last year’s Data/AI Salary Survey report, expertise in Scala, Rust, or Go commands a higher salary ($211,000, $202,000, and $210,000, respectively). While the demand for these languages isn’t as high, there’s a lot less available expertise. Furthermore, fluency in any of these languages shows that a programmer has gone considerably beyond basic competence. They’ve done the work necessary to pick up additional skills.

Figure 18. Programming language usage by percentage of respondents

The lowest salaries were reported by respondents using PHP ($155,000). Salaries for C, C++, and C# are also surprisingly low ($170,000, $172,000, and $170,000, respectively); given the importance of C and C++ for software development in general and the importance of C# for the Microsoft world, we find it hard to understand why.

Almost all of the respondents use multiple languages. If we had to make a recommendation for someone who wanted to move into cloud development or operations, or for someone planning a cloud strategy from scratch, it would be simple: focus on SQL plus one of the other table stakes languages (Java, JavaScript, or Python). If you want to go further, pick one of the languages associated with the highest salaries. We think Scala is past its peak, but because of its strong connection to the Java ecosystem, Scala makes sense for Java programmers. For Pythonistas, we’d recommend choosing Go or Rust.

Figure 19. Average salary by programming language

Operating Systems

We asked our survey participants which operating systems they used so we could test something we’ve heard from several people who hire software developers: Linux is a must. That appears to be the case: 80% of respondents use Linux (Figure 20). Even though Linux really hasn’t succeeded in the desktop market (sorry), it’s clearly the operating system for most software that runs in the cloud. If Linux isn’t a requirement, it’s awfully close.

67% of the respondents reported using macOS, but we suspect that’s mostly as a desktop or laptop operating system. Of the major providers, only AWS offers macOS virtual instances, and they’re not widely used. (Apple’s license only allows macOS to run on Apple hardware, and only AWS provides Apple servers.) 57% of the respondents reported using some version of Windows. While we suspect that Windows is also used primarily as a desktop or laptop operating system, Windows virtual instances are available from all the major providers, including Oracle and IBM.

Figure 20. Operating system usage by percentage of respondents

Tools

We saw little variation in salary from tool to tool. This lack of variation makes sense. As we said above, we don’t expect a carpenter who uses a hammer to be paid more than a carpenter who uses a saw. To be a competent carpenter, you need to use both, along with levels, squares, and a host of other tools.

However, it is interesting to know what tools are commonly in use (Figure 21). There aren’t any real surprises. Docker is almost universal, used by 76% of the respondents. Kubernetes is also widespread, used by 61% of the respondents. Other components of the Kubernetes ecosystem didn’t fare as well: 27% of respondents reported using Helm, and 12% reported using Istio, which has been widely criticized for being too complex.

Alternatives to this core cluster of tools don’t appear to have much traction. 10% of the respondents reported using OpenShift, the IBM/Red Hat package that includes Kubernetes and other core components. Our respondents seem to prefer building their tooling environment themselves. Podman, an alternative to Docker and a component of OpenShift, is only used by 8% of the respondents. Unfortunately, we didn’t ask about Linkerd, which appears to be establishing itself as a service mesh that’s simpler to configure than Istio. However, it didn’t show up among the write-in responses, and the number of respondents who said “other” was relatively small (9%).

The HashiCorp tool set (Terraform, Consul, and Vault) appears to be more widely used: 41% of the respondents reported using Terraform, 17% use Vault, and 8% use Consul. However, don’t view these as alternatives to Kubernetes. Terraform is a tool for building and configuring cloud infrastructure, and Vault is a secure repository for secrets. Only Consul, a service mesh, competes directly with tools in the Kubernetes ecosystem.

Figure 21. Tool usage by percentage of respondents

The Biggest Impact

Finally, we asked the respondents what would have the biggest impact on compensation and promotion. The least common answer was “data tools” (6%). This segment of our audience clearly isn’t working directly with data science or AI—though we’d argue that might change as more machine learning applications reach production. “Programming languages” was second from the bottom. The lack of concern about programming languages reflects reality. While we observed higher salaries for respondents who used Scala, Rust, or Go, if you’re solidly grounded in the basics (like Python and SQL), you’re in good shape. There’s limited value in pursuing additional languages once you have the table stakes.

The largest number of respondents said that knowledge of “cloud and containers” would have the largest effect on compensation. Again, containers are table stakes, as we saw in the previous section. Automation, security, and machine learning were also highly rated (18%, 15%, and 16%, respectively). It’s not clear why machine learning was ranked highly but data tools wasn’t. Perhaps our respondents interpreted “data tools” as software like Excel, R, and pandas.

11% of the respondents wrote in an answer. As usual with write-ins, the submissions were scattered, and mostly singletons. However, many of the write-in answers pointed toward leadership and management skills. Taken all together, these varied responses add up to about 2% of the total respondents. Not a large number, but still a signal that some part of our audience is thinking seriously about IT leadership.

Confidence in the Future

“Cloud adoption is up and to the right”? No, we already told you we weren’t going to conclude that. Though it’s no doubt true; we don’t see cloud adoption slowing in the near future.

Salaries are high. That’s good for employees and difficult for employers. It’s common for staff to jump to another employer offering a higher salary and a generous signing bonus. The current stock market correction may put a damper on that trend. There are signs that Silicon Valley’s money supply is starting to dry up, in part because of higher interest rates but also because investors are nervous about how the online economy will respond to regulation, and impatient with startups whose business plan is to lose billions “buying” a market before they figure out how to make money. Higher interest rates and nervous investors could mean an end to skyrocketing salaries.

The gap between women’s and men’s salaries has narrowed, but it hasn’t closed. While we don’t have a direct comparison for the previous year, last year’s Data/AI Salary Survey report showed a 16% gap. In this survey, the gap has been cut to 7%, and women are receiving salary increases that are likely to close that gap even further. It’s anyone’s guess how this will play out in the future. Talent is in short supply, and that puts upward pressure on salaries. Next year, will we see women’s salaries on par with men’s? Or will the gap widen again when the talent shortage isn’t so acute?

While we aren’t surprised by the trend toward remote work, we are surprised at how widespread remote work has become: as we saw, only 10% of our survey respondents never work remotely, and almost two-thirds work remotely full time. Remote work may be easier for cloud professionals, because part of their job is inherently remote. However, after seeing these results, we’d predict similar numbers for other industry sectors. Remote work is here to stay.

Almost half of our survey respondents participated in some form of training in the past year. Training on the major cloud platforms (AWS, Azure, and Google Cloud) was associated with higher salaries. However, our participants also wrote in 55 “other” kinds of training and certifications, of which the most popular was CKA (Certified Kubernetes Administrator).

Let’s end by thinking a bit more about the most common answer to the question “What area do you feel will have the biggest impact on compensation and promotion in the next year?”: cloud and containers. Our first reaction is that this is a poorly phrased option; we should have just asked about containers. Perhaps that’s true, but there’s something deeper hidden in this answer. If you want to get ahead in cloud computing, learn more about the cloud. It’s tautological, but it also shows some real confidence in where the industry is heading. Cloud professionals may be looking for their next employer, but they aren’t looking to jump ship to the “next big thing.” Businesses aren’t jumping away from the cloud to “the next big thing” either; whether it’s AI, the “metaverse,” or something else, their next big thing will be built in the cloud. And containers are the building blocks of the cloud; they’re the foundation on which the future of cloud computing rests. Salaries are certainly “up and to the right,” and we don’t see demand for cloud-capable talent dropping any time in the near future.

Categories: Technology

“Sentience” is the Wrong Question

O'Reilly Radar - Tue, 2022/06/21 - 06:30

On June 6, Blake Lemoine, a Google engineer, was suspended by Google for disclosing a series of conversations he had with LaMDA, Google’s impressive large language model, in violation of his NDA. Lemoine’s claim that LaMDA has achieved “sentience” was widely publicized–and criticized–by almost every AI expert. It came only two weeks after Nando de Freitas, tweeting about DeepMind’s new Gato model, claimed that artificial general intelligence is only a matter of scale. I’m with the experts; I think Lemoine was taken in by his own willingness to believe, and I believe de Freitas is wrong about general intelligence. But I also think that “sentience” and “general intelligence” aren’t the questions we ought to be discussing.

The latest generation of models is good enough to convince some people that they are intelligent, and whether or not those people are deluding themselves is beside the point. What we should be talking about is what responsibility the researchers building those models have to the general public. I recognize Google’s right to require employees to sign an NDA; but when a technology has implications as potentially far-reaching as general intelligence, are they right to keep it under wraps?  Or, looking at the question from the other direction, will developing that technology in public breed misconceptions and panic where none is warranted?

Google is one of the three major actors driving AI forward, along with OpenAI and Facebook. These three have demonstrated different attitudes towards openness. Google communicates largely through academic papers and press releases; we see gaudy announcements of its accomplishments, but the number of people who can actually experiment with its models is extremely small. OpenAI is much the same, though it has also made it possible to test-drive models like GPT-2 and GPT-3, in addition to letting others build new products on top of its APIs–GitHub Copilot is just one example. Facebook has open sourced its largest model, OPT-175B, along with several smaller pre-built models and a voluminous set of notes describing how OPT-175B was trained.

I want to look at these different versions of “openness” through the lens of the scientific method. (And I’m aware that this research really is a matter of engineering, not science.)  Very generally speaking, we ask three things of any new scientific advance:

  • It can reproduce past results. It’s not clear what this criterion means in this context; we don’t want an AI to reproduce the poems of Keats, for example. We would want a newer model to perform at least as well as an older model.
  • It can predict future phenomena. I interpret this as being able to produce new texts that are (as a minimum) convincing and readable. It’s clear that many AI models can accomplish this.
  • It is reproducible. Someone else can do the same experiment and get the same result. Cold fusion fails this test badly. What about large language models?

Because of their scale, large language models have a significant problem with reproducibility. You can download the source code for Facebook’s OPT-175B, but you won’t be able to train it yourself on any hardware you have access to. It’s too large even for universities and other research institutions. You still have to take Facebook’s word that it does what it says it does. 
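
A rough back-of-the-envelope calculation shows why. The following sketch (my own estimate, assuming fp16 storage and using a hypothetical 80 GB accelerator as a yardstick) sizes the memory needed just to hold a 175-billion-parameter model’s weights:

```python
# Back-of-the-envelope sizing for a 175B-parameter model.
# Assumption: weights stored in fp16 (2 bytes per parameter).
# Training needs gradients and optimizer state on top of this,
# often several times the weight memory again.
params = 175e9
bytes_per_param = 2  # fp16

weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")  # ~350 GB

# Even a large 80 GB accelerator can't hold the weights,
# so inference -- let alone training -- must be sharded
# across many devices.
min_devices = int(-(-weights_gb // 80))  # ceiling division
print(f"Minimum 80 GB devices just to hold the weights: {min_devices}")
```

Even before training enters the picture, simply loading the model requires a multi-accelerator cluster; a full training run multiplies that requirement many times over.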

This isn’t just a problem for AI. One of our authors from the 90s went from grad school to a professorship at Harvard, where he researched large-scale distributed computing. A few years after getting tenure, he left Harvard to join Google Research. Shortly after arriving at Google, he blogged that he was “working on problems that are orders of magnitude larger and more interesting than I can work on at any university.” That raises an important question: what can academic research mean when it can’t scale to the size of industrial processes? Who will have the ability to replicate research results on that scale? This isn’t just a problem for computer science; many recent experiments in high-energy physics require energies that can only be reached at the Large Hadron Collider (LHC). Do we trust results if there’s only one laboratory in the world where they can be reproduced?

That’s exactly the problem we have with large language models. OPT-175B can’t be reproduced at Harvard or MIT. It probably can’t even be reproduced by Google and OpenAI, even though they have sufficient computing resources. I would bet that OPT-175B is too closely tied to Facebook’s infrastructure (including custom hardware) to be reproduced on Google’s infrastructure. I would bet the same is true of LaMDA, GPT-3, and other very large models, if you take them out of the environment in which they were built.  If Google released the source code to LaMDA, Facebook would have trouble running it on its infrastructure. The same is true for GPT-3. 

So: what can “reproducibility” mean in a world where the infrastructure needed to reproduce important experiments can’t be reproduced?  The answer is to provide free access to outside researchers and early adopters, so they can ask their own questions and see the wide range of results. Because these models can only run on the infrastructure where they’re built, this access will have to be via public APIs.

There are lots of impressive examples of text produced by large language models. LaMDA’s are the best I’ve seen. But we also know that, for the most part, these examples are heavily cherry-picked. And there are many examples of failures, which are certainly also cherry-picked.  I’d argue that, if we want to build safe, usable systems, paying attention to the failures (cherry-picked or not) is more important than applauding the successes. Whether it’s sentient or not, we care more about a self-driving car crashing than about it navigating the streets of San Francisco safely at rush hour. That’s not just our (sentient) propensity for drama;  if you’re involved in the accident, one crash can ruin your day. If a natural language model has been trained not to produce racist output (and that’s still very much a research topic), its failures are more important than its successes. 

With that in mind, OpenAI has done well by allowing others to use GPT-3–initially, through a limited free trial program, and now, as a commercial product that customers access through APIs. While we may be legitimately concerned by GPT-3’s ability to generate pitches for conspiracy theories (or just plain marketing), at least we know those risks.  For all the useful output that GPT-3 creates (whether deceptive or not), we’ve also seen its errors. Nobody’s claiming that GPT-3 is sentient; we understand that its output is a function of its input, and that if you steer it in a certain direction, that’s the direction it takes. When GitHub Copilot (built from OpenAI Codex, which itself is built from GPT-3) was first released, I saw lots of speculation that it will cause programmers to lose their jobs. Now that we’ve seen Copilot, we understand that it’s a useful tool within its limitations, and discussions of job loss have dried up. 

Google hasn’t offered that kind of visibility for LaMDA. It’s irrelevant whether they’re concerned about intellectual property, liability for misuse, or inflaming public fear of AI. Without public experimentation with LaMDA, our attitudes towards its output–whether fearful or ecstatic–are based at least as much on fantasy as on reality. Whether or not we put appropriate safeguards in place, research done in the open, and the ability to play with (and even build products from) systems like GPT-3, have made us aware of the consequences of “deep fakes.” Those are realistic fears and concerns. With LaMDA, we can’t have realistic fears and concerns. We can only have imaginary ones–which are inevitably worse. In an area where reproducibility and experimentation are limited, allowing outsiders to experiment may be the best we can do. 


Topic for June 9th

PLUG - Thu, 2022/06/09 - 17:55

This is a remote meeting. Please join by going to at 7pm on Thursday June 9th.

Brian Peters: Virtual Data Optimizer (VDO) - Data Reduction for Block Storage

Introduction to Virtual Data Optimizer (VDO), an advanced storage technology for maximizing drive space. In this presentation we'll discuss use cases for VDO, advantages & disadvantages, and demo configuring & testing a drive using Virtual Data Optimizer.

About Brian:
Brian Peters has been interested in technology since childhood. His first PC was a 486 clone that was upgraded many times over. His interest in Linux started with Ubuntu 5.10 (Breezy Badger), but he has since found a home with Debian. Brian is RHCSA certified and enjoys sharing his passion for FOSS with others.

Closer to AGI?

O'Reilly Radar - Tue, 2022/06/07 - 04:09

DeepMind’s new model, Gato, has sparked a debate on whether artificial general intelligence (AGI) is nearer–almost at hand–just a matter of scale.  Gato is a model that can solve multiple unrelated problems: it can play a large number of different games, label images, chat, operate a robot, and more.  Not so many years ago, one problem with AI was that AI systems were only good at one thing. After IBM’s Deep Blue defeated Garry Kasparov in chess,  it was easy to say “But the ability to play chess isn’t really what we mean by intelligence.” A model that plays chess can’t also play space wars. That’s obviously no longer true; we can now have models capable of doing many different things. 600 things, in fact, and future models will no doubt do more.

So, are we on the verge of artificial general intelligence, as Nando de Freitas (research director at DeepMind) claims? That the only problem left is scale? I don’t think so.  It seems inappropriate to be talking about AGI when we don’t really have a good definition of “intelligence.” If we had AGI, how would we know it? We have a lot of vague notions about the Turing test, but in the final analysis, Turing wasn’t offering a definition of machine intelligence; he was probing the question of what human intelligence means.

Consciousness and intelligence seem to require some sort of agency.  An AI can’t choose what it wants to learn, nor can it say “I don’t want to play Go, I’d rather play Chess.” Now that we have computers that can do both, can they “want” to play one game or the other? One reason we know our children (and, for that matter, our pets) are intelligent and not just automatons is that they’re capable of disobeying. A child can refuse to do homework; a dog can refuse to sit. And that refusal is as important to intelligence as the ability to solve differential equations, or to play chess. Indeed, the path towards artificial intelligence is as much about teaching us what intelligence isn’t (as Turing knew) as it is about building an AGI.

Even if we accept that Gato is a huge step on the path towards AGI, and that scaling is the only problem that’s left, it is more than a bit problematic to think that scaling is a problem that’s easily solved. We don’t know how much power it took to train Gato, but GPT-3 required about 1.3 gigawatt-hours: roughly 1/1000th the energy it takes to run the Large Hadron Collider for a year. Granted, Gato is much smaller than GPT-3, though it doesn’t work as well; Gato’s performance is generally inferior to that of single-function models. And granted, a lot can be done to optimize training (and DeepMind has done a lot of work on models that require less energy). But Gato has just over 600 capabilities, focusing on natural language processing, image classification, and game playing. These are only a few of many tasks an AGI will need to perform. How many tasks would a machine have to perform to qualify as a “general intelligence”? Thousands?  Millions? Can those tasks even be enumerated? At some point, the project of training an artificial general intelligence sounds like something from Douglas Adams’ novel The Hitchhiker’s Guide to the Galaxy, in which the Earth is a computer designed by an AI called Deep Thought to answer the question “What is the question to which 42 is the answer?”
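
To put that 1.3 gigawatt-hour figure in more familiar terms, here is a quick conversion (a sketch; the average US household consumption of roughly 10,600 kWh per year is my assumption, not a number from the article):

```python
# Scale check on the quoted training-energy figure.
# Assumptions: GPT-3 training took ~1.3 GWh (as stated above);
# an average US household uses ~10,600 kWh of electricity per
# year (assumed figure, not from the article).
gpt3_training_kwh = 1.3e6        # 1.3 GWh expressed in kWh
household_kwh_per_year = 10_600  # assumed US average

homes_equivalent = gpt3_training_kwh / household_kwh_per_year
print(f"Roughly {homes_equivalent:.0f} households' annual electricity use")
```

On those assumptions, training a single GPT-3-class model consumes about as much electricity as 120-odd homes use in a year, and Gato-scale models are only the beginning of what an AGI-by-scaling program would require.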

Building bigger and bigger models in hope of somehow achieving general intelligence may be an interesting research project, but AI may already have achieved a level of performance that suggests specialized training on top of existing foundation models will reap far more short term benefits. A foundation model trained to recognize images can be trained further to be part of a self-driving car, or to create generative art. A foundation model like GPT-3 trained to understand and speak human language can be trained more deeply to write computer code.

Yann LeCun posted a Twitter thread about general intelligence (consolidated on Facebook) stating some “simple facts.” First, LeCun says that there is no such thing as “general intelligence.” LeCun also says that “human level AI” is a useful goal–acknowledging that human intelligence itself is something less than the type of general intelligence sought for AI. All humans are specialized to some extent. I’m human; I’m arguably intelligent; I can play Chess and Go, but not Xiangqi (often called Chinese Chess) or Golf. I could presumably learn to play other games, but I don’t have to learn them all. I can also play the piano, but not the violin. I can speak a few languages. Some humans can speak dozens, but none of them speak every language.

There’s an important point about expertise hidden in here: we expect our AGIs to be “experts” (to beat top-level Chess and Go players), but as a human, I’m only fair at chess and poor at Go. Does human intelligence require expertise? (Hint: re-read Turing’s original paper about the Imitation Game, and check the computer’s answers.) And if so, what kind of expertise? Humans are capable of broad but limited expertise in many areas, combined with deep expertise in a small number of areas. So this argument is really about terminology: could Gato be a step towards human-level intelligence (limited expertise for a large number of tasks), but not general intelligence?

LeCun agrees that we are missing some “fundamental concepts,” and we don’t yet know what those fundamental concepts are. In short, we can’t adequately define intelligence. More specifically, though, he mentions that “a few others believe that symbol-based manipulation is necessary.” That’s an allusion to the debate (sometimes on Twitter) between LeCun and Gary Marcus, who has argued many times that combining deep learning with symbolic reasoning is the only way for AI to progress. (In his response to the Gato announcement, Marcus labels this school of thought “Alt-intelligence.”) That’s an important point: impressive as models like GPT-3 and GLaM are, they make a lot of mistakes. Sometimes those are simple mistakes of fact, such as when GPT-3 wrote an article about the United Methodist Church that got a number of basic facts wrong. Sometimes, the mistakes reveal a horrifying (or hilarious, they’re often the same) lack of what we call “common sense.” Would you sell your children for refusing to do their homework? (To give GPT-3 credit, it points out that selling your children is illegal in most countries, and that there are better forms of discipline.)

It’s not clear, at least to me, that these problems can be solved by “scale.” How much more text would you need to know that humans don’t, normally, sell their children? I can imagine “selling children” showing up in sarcastic or frustrated remarks by parents, along with texts discussing slavery. I suspect there are few texts out there that actually state that selling your children is a bad idea. Likewise, how much more text would you need to know that Methodist general conferences take place every four years, not annually? The general conference in question generated some press coverage, but not a lot; it’s reasonable to assume that GPT-3 had most of the facts that were available. What additional data would a large language model need to avoid making these mistakes? Minutes from prior conferences, documents about Methodist rules and procedures, and a few other things. As modern datasets go, it’s probably not very large; a few gigabytes, at most. But then the question becomes “How many specialized datasets would we need to train a general intelligence so that it’s accurate on any conceivable topic?”  Is that answer a million?  A billion?  What are all the things we might want to know about? Even if any single dataset is relatively small, we’ll soon find ourselves building the successor to Douglas Adams’ Deep Thought.

Scale isn’t going to help. But in that problem lies, I think, a solution. If I were to build an artificial therapist bot, would I want a general language model?  Or would I want a language model that has broad knowledge, but has received special training to give it deep expertise in psychotherapy? Similarly, if I want a system that writes news articles about religious institutions, do I want a fully general intelligence? Or would it be preferable to train a general model with data specific to religious institutions? The latter seems preferable–and it’s certainly more similar to real-world human intelligence, which is broad, but with areas of deep specialization. Building such an intelligence is a problem we’re already on the road to solving, by using large “foundation models” with additional training to customize them for special purposes. GitHub’s Copilot is one such model; O’Reilly Answers is another.

If a “general AI” is no more than “a model that can do lots of different things,” do we really need it, or is it just an academic curiosity?  What’s clear is that we need better models for specific tasks. If the way forward is to build specialized models on top of foundation models, and if this process generalizes from language models like GPT-3 and O’Reilly Answers to other models for different kinds of tasks, then we have a different set of questions to answer. First, rather than trying to build a general intelligence by making an even bigger model, we should ask whether we can build a good foundation model that’s smaller, cheaper, and more easily distributed, perhaps as open source. Google has done some excellent work at reducing power consumption, though it remains huge, and Facebook has released their OPT model with an open source license. Does a foundation model actually require anything more than the ability to parse and create sentences that are grammatically correct and stylistically reasonable?  Second, we need to know how to specialize these models effectively.  We can obviously do that now, but I suspect that training these subsidiary models can be optimized. These specialized models might also incorporate symbolic manipulation, as Marcus suggests; for two of our examples, psychotherapy and religious institutions, symbolic manipulation would probably be essential. If we’re going to build an AI-driven therapy bot, I’d rather have a bot that can do that one thing well than a bot that makes mistakes that are much subtler than telling patients to commit suicide. I’d rather have a bot that can collaborate intelligently with humans than one that needs to be watched constantly to ensure that it doesn’t make any egregious mistakes.

We need the ability to combine models that perform different tasks, and we need the ability to interrogate those models about the results. For example, I can see the value of a chess model that included (or was integrated with) a language model that would enable it to answer questions like “What is the significance of Black’s 13th move in the 4th game of Fischer vs. Spassky?” Or “You’ve suggested Qc5, but what are the alternatives, and why didn’t you choose them?” Answering those questions doesn’t require a model with 600 different abilities. It requires two abilities: chess and language. Moreover, it requires the ability to explain why the AI rejected certain alternatives in its decision-making process. As far as I know, little has been done on this latter question, though the ability to expose other alternatives could be important in applications like medical diagnosis. “What solutions did you reject, and why did you reject them?” seems like important information we should be able to get from an AI, whether or not it’s “general.”

An AI that can answer those questions seems more relevant than an AI that can simply do a lot of different things.

Optimizing the specialization process is crucial because we’ve turned a technology question into an economic question. How many specialized models, like Copilot or O’Reilly Answers, can the world support? We’re no longer talking about a massive AGI that takes terawatt-hours to train, but about specialized training for a huge number of smaller models. A psychotherapy bot might be able to pay for itself–even though it would need the ability to retrain itself on current events, for example, to deal with patients who are anxious about, say, the invasion of Ukraine. (There is ongoing research on models that can incorporate new information as needed.) It’s not clear that a specialized bot for producing news articles about religious institutions would be economically viable. That’s the third question we need to answer about the future of AI: what kinds of economic models will work? Since AI models are essentially cobbling together answers from other sources that have their own licenses and business models, how will our future agents compensate the sources from which their content is derived? How should these models deal with issues like attribution and license compliance?

Finally, projects like Gato don’t help us understand how AI systems should collaborate with humans. Rather than just building bigger models, researchers and entrepreneurs need to be exploring different kinds of interaction between humans and AI. That question is out of scope for Gato, but it is something we need to address regardless of whether the future of artificial intelligence is general or narrow but deep. Most of our current AI systems are oracles: you give them a prompt, they produce an output.  Correct or incorrect, you get what you get, take it or leave it. Oracle interactions don’t take advantage of human expertise, and risk wasting human time on “obvious” answers, where the human says “I already know that; I don’t need an AI to tell me.”

There are some exceptions to the oracle model. Copilot places its suggestion in your code editor, and changes you make can be fed back into the engine to improve future suggestions. Midjourney, a platform for AI-generated art that is currently in closed beta, also incorporates a feedback loop.

In the next few years, we will inevitably rely more and more on machine learning and artificial intelligence. If that interaction is going to be productive, we will need a lot from AI. We will need interactions between humans and machines, a better understanding of how to train specialized models, the ability to distinguish between correlations and facts–and that’s only a start. Products like Copilot and O’Reilly Answers give a glimpse of what’s possible, but they’re only the first steps. AI has made dramatic progress in the last decade, but we won’t get the products we want and need merely by scaling. We need to learn to think differently.


Radar Trends to Watch: June 2022

O'Reilly Radar - Wed, 2022/06/01 - 04:54

The explosion of large models continues. Several developments are especially noteworthy. DeepMind’s Gato model is unique in that it’s a single model that’s trained for over 600 different tasks; whether or not it’s a step towards general intelligence (the ensuing debate may be more important than the model itself), it’s an impressive achievement. Google Brain’s Imagen creates photorealistic images that are impressive, even after you’ve seen what DALL-E 2 can do. And Allen AI’s Macaw (surely an allusion to Emily Bender and Timnit Gebru’s Stochastic Parrots paper) is open source, one tenth the size of GPT-3, and claims to be more accurate. Facebook/Meta is also releasing an open source large language model, including the model’s training log, which records in detail the work required to train it.

Artificial Intelligence
  • Is thinking of autonomous vehicles as AI systems rather than as robots the next step forward? A new wave of startups is trying techniques such as reinforcement learning to train AVs to drive safely.
  • Generative Flow Networks may be the next major step in building better AI systems.
  • The ethics of building AI bots that mimic real dead people seems like an academic question, until someone does it: using GPT-3, a developer created a bot based on his deceased fiancée. OpenAI objected, stating that building such a bot was a violation of its terms of service.
  • Cortical Labs and other startups are building computers that incorporate human neurons. It’s claimed that these systems can be trained to perform game-playing tasks significantly faster than traditional AI.
  • Google Brain has built a new text-to-image generator called Imagen that creates photorealistic images. Although images generated by projects like this are always cherry-picked, the image quality is impressive; the developers claim that it is better than DALL-E 2.
  • DeepMind has created a new “generalist” model called Gato. It is a single model that can solve many different kinds of tasks: playing multiple games, labeling images, and so on. It has prompted a debate on whether Artificial General Intelligence is simply a matter of scale.
  • AI in autonomous vehicles can be used to eliminate waiting at traffic lights, increase travel speed, and reduce fuel consumption and carbon emissions. Surprisingly, if only 25% of the vehicles are autonomous, you get 50% of the benefit.
  • Macaw is a language model developed by Allen AI (AI2). It is freely available and open-source. Macaw is 1/10th the size of GPT-3 and roughly 10% more accurate at answering questions, though (like GPT-3) it tends to fail at questions that require common sense or involve logical tricks.
  • Ai-Da is an AI-driven robot that can paint portraits–but is it art? Art is as much about human perception as it is about creation. What social cues prompt us to think that a robot is being creative?
  • Facebook/Meta has created a large language model called OPT that is similar in size and performance to GPT-3. Using the model is free for non-commercial work; the code is being released open source, along with documents describing how the model was trained.
  • Alice is a modular and extensible open source virtual assistant (think Alexa) that can run completely offline. It is private by default, though it can be configured to use Amazon or Google as backups. Alice can identify different users (for whom it can develop “likes” or “dislikes,” based on interactions).
  • High volume event streaming without a message queue: Palo Alto Networks has built a system for processing terabytes of security events per day without using a message queue, just a NoSQL database.
  • New tools allow workflow management across groups of spreadsheets. Spreadsheets are the original “low code”; these tools seem to offer spreadsheet users many of the features that software developers get from tools like git.
  • Portainer is a container management tool that lets you mount Docker containers as persistent filesystems.
  • NVIDIA has open-sourced its Linux device drivers. The code is available on GitHub. This is a significant change for a company that historically has avoided open source.
  • A startup named Buoyant is building tools to automate management of Linkerd. Linkerd, in turn, is a service mesh that is easier to manage than Istio and more appropriate for small to medium businesses.
  • Are we entering the “third age of JavaScript”? An intriguing article suggests that we are. In this view of the future, static site generation disappears, incremental rendering and edge routing become more important, and Next.js becomes a dominant platform.
  • Rowy is a low-code programming environment that intends to escape the limitations of Airtable and other low-code collaboration services. The interface is like a spreadsheet, but it’s built on top of the Google Cloud Firestore document database.
  • PyScript is a framework for running Python in the browser, mixed with HTML (in some ways, not unlike PHP). It is based on Pyodide (a WASM implementation of Python), integrates well with JavaScript, and might support other languages in the future.
  • Machine learning raises the possibility of undetectable backdoor attacks: malicious attacks that can affect the output of a model but don’t measurably affect its performance. Security issues for machine learning aren’t well understood, and aren’t getting a lot of attention.
  • In a new supply chain attack, two widely used libraries (Python’s ctx and PHP’s PHPass) have been compromised to steal AWS credentials. The attacker now claims that these exploits were “ethical research,” possibly with the goal of winning bounties for reporting exploits.
  • While it is not yet accurate enough to work in practice, a new method for detecting cyber attacks can detect and stop attacks in under one second.
  • The Eternity Project is a new malware-as-a-service organization that offers many different kinds of tools for data theft, ransomware, and many other exploits. It’s possible that the project is itself a scam, but it appears to be genuine.
  • Palo Alto Networks has published a study showing that most cloud identity and access management policies are too permissive, and that 90% of the permissions granted are never used. Overly-permissive policies are a major vulnerability for cloud users.
  • NIST has just published a massive guide to supply chain security. For organizations that can’t digest this 326-page document, they plan to publish a quick-start guide.
  • The Passkey standard, supported by Google, Apple, and Microsoft, replaces passwords with other forms of authentication. An application makes an authentication request to the device, which can then respond using any authentication method it supports. Passkey is operating system-independent, and supports Bluetooth in addition to Internet protocols.
  • Google and Mandiant both report significant year-over-year increases in the number of 0-day vulnerabilities discovered in 2021.
  • Interesting statistics about ransomware attacks: The ransom is usually only 15% of the total cost of the attack; and on average, the ransom is 2.8% of net revenue (with discounts of up to 25% for prompt payment).
  • Bugs in the most widely used ransomware software, including REvil and Conti, can be used to prevent the attacker from encrypting your data.
Web, Web3, and VR/AR/Metaverse
  • Niantic is building VPS (Visual Positioning System), an augmented reality map of the world, as part of its Lightship platform. VPS allows games and other AR products to be grounded to the physical world.
  • LivingCities is building a digital twin of the real world as a platform for experiencing the world in extended reality. That experience includes history, a place’s textures and feelings, and, of course, a new kind of social media.
  • New research in haptics allows the creation of realistic virtual textures by measuring how people feel things. Humans are extremely sensitive to the textures of materials, so creating good textures is important for everything from video games to telesurgery.
  • Google is upgrading its search engine for augmented reality: they are integrating images more fully into searches, creating multi-modal searches that incorporate images, text, and audio, and generating search results that can be explored through AR.
  • BabylonJS is an open source 3D engine based on WebGL and WebGPU, developed by Microsoft. It is a strong hint that Microsoft’s version of the Metaverse will be web-based; it will support WebXR.
  • The fediverse is an ensemble of microblogging social media sites (such as Mastodon) that communicate with each other. Will they become a viable alternative to Elon Musk’s Twitter?
  • Varjo is building a “reality cloud”: a 3D mixed reality streaming service that allows photorealistic “virtual teleportation.” It’s not about weird avatars in a fake 3D world; they record your actions in your actual environment.
Hardware Design
  • Ethical design starts with a redefinition of success: well-being, equity, and sustainability, with good metrics for measuring your progress.
Quantum Computing
  • QICK is a new standardized control plane for quantum devices. The design of the control plane, including software, is all open source. A large part of the cost of building a quantum device is building the electronics to control it. QICK will greatly reduce the cost of quantum experimentation.
  • Researchers have built logical gates using error-corrected quantum bits. This is a significant step towards building a useful quantum computer.
Categories: Technology

Building a Better Middleman

O'Reilly Radar - Tue, 2022/05/17 - 03:58

In the previous article, I explored the role of the middleman in a two-sided marketplace. The term “middleman” has a stigma to it, mostly because, when you sit between two parties that want to interact, it’s easy to get greedy.

Greed will bring you profits in the short term. Probably in the long term, as well.  As a middleman, though, your greed is an existential threat.  When you abuse your position and mistreat the parties you connect–when your cost outweighs your value–they’ll find a way to replace you. Maybe not today, maybe not tomorrow, but it will happen.

Luckily, you can make money as a middleman and still keep everyone happy.  Here’s how to create that win-win-win triangle:

Keep refining your platform

Running a marketplace is a game of continuous improvement. You need to keep asking yourself: how can I make this better for the people who interact through the marketplace?

To start, you can look for ways to make your platform more attractive to existing customers. I emphasize both sets of customers, not just one side of the marketplace. Mistreating one side to favor the other may work for a time, but it will eventually backfire. Frustration has a way of helping people overcome switching costs.

Some stock exchanges designate market makers (“specialists,” if you’re old-school), firms that are always ready to both buy and sell shares of a given stock. If I want to offload a thousand shares and there’s no one who wants to buy them from me, the market maker steps in to play the role of the buyer. By guaranteeing that there will always be someone on the other side of the bid or ask, exchanges keep everyone happy.
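The market-maker fallback described above can be sketched in a few lines. This is a toy model, not any real exchange's matching engine; the class name, prices, and quantities are all invented for illustration:

```python
# Toy sketch of a market-maker fallback: a sell order first fills against
# resting buy orders; the designated market maker absorbs any remainder,
# guaranteeing there's always someone on the other side of the trade.
from dataclasses import dataclass, field

@dataclass
class Exchange:
    buy_orders: list = field(default_factory=list)  # (price, qty) resting bids
    maker_bid: float = 0.0                          # market maker's standing bid

    def sell(self, qty, limit_price):
        """Fill against the best bids; the market maker takes what's left."""
        filled = []
        for price, avail in sorted(self.buy_orders, reverse=True):
            if qty == 0 or price < limit_price:
                break
            take = min(qty, avail)
            filled.append(("trader", price, take))
            qty -= take
        if qty > 0 and self.maker_bid >= limit_price:
            # No more willing buyers: the market maker steps in as counterparty.
            filled.append(("market_maker", self.maker_bid, qty))
        return filled

ex = Exchange(buy_orders=[(10.05, 300)], maker_bid=10.00)
fills = ex.sell(1000, limit_price=9.95)
# A trader takes 300 shares at 10.05; the market maker absorbs the other 700.
```

A real exchange would also decrement the resting orders and track the market maker's inventory risk; the point here is only the guaranteed-counterparty mechanic.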

If you constantly review how the two parties interact, you can look for opportunities to mitigate their risk, create new services, or otherwise reduce friction. Most platforms connect strangers, right?  So if you look at your business through the lens of safety, you’ll find a lot of work to do. Note how eBay’s review system provides extra assurance for buyers and sellers to trade with people they’ve never met.  Similarly, in the early days of online commerce, credit card issuers limited shoppers’ fraud risk to just $50 per purchase.  This improved consumers’ trust in online shopping, which helped make e-commerce the everyday norm that it is today.

Safety improvements also extend to communications. Do the parties really need to swap e-mail addresses or phone numbers? If they’re just confirming a rideshare pickup or flirting through a dating app, probably not. As a middleman, you are perfectly positioned to serve as the conduit: one that provides an appropriate level of masking or pseudonymity. And the money you invest in deploying a custom messaging system or temporary phone numbers (Twilio, anyone?) will pay off in terms of improved adoption and retention.
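The conduit idea can be sketched in miniature. The `Relay` class, handles, and addresses below are all hypothetical; a real system would sit in front of an SMS or email gateway rather than returning strings:

```python
# Hypothetical sketch of the middleman-as-conduit idea: parties exchange
# messages through pseudonymous handles, so real contact details never leak.
import secrets

class Relay:
    def __init__(self):
        self._contacts = {}  # handle -> real address (never exposed to users)

    def register(self, real_address):
        handle = "user-" + secrets.token_hex(4)
        self._contacts[handle] = real_address
        return handle  # the only identifier the other party ever sees

    def deliver(self, to_handle, message):
        real = self._contacts[to_handle]  # resolved only inside the platform
        return f"to:{real} | {message}"   # stand-in for an SMS/email gateway

relay = Relay()
rider = relay.register("rider@example.com")
driver = relay.register("+1-555-0100")
# The driver messages the rider's handle, never the rider's real address.
msg = relay.deliver(rider, "I'm outside in the blue sedan")
```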

Design new products and services

If you understand how your parties interact and what they want to achieve, you’re in a position to spot new product opportunities that will make your customers happy.

According to Cyril Nigg, Director of Analytics at Reverb, the music-gear marketplace was “founded by music makers, for music makers.”  Musicians like to try new gear, but they want to offload it if it doesn’t pan out. Reverb has therefore built tools around pricing assistance to help musicians with their product listings: You want to sell this distortion pedal within 7 days? List it at $X. This extra assurance that they’ll be able to resell a piece of equipment, in short order, reduces apprehensions about buying. (Going back to the point about keeping both sides of the marketplace happy: Cyril also pointed out that a Reverb customer may act as both buyer and seller across different transactions.  That means the company can’t skimp on one side of the experience.)

People on a dating site want to communicate, so an easy win there is to keep an eye on new communications tools. Maybe your platform started out with an asynchronous, text-based tool that resembled e-mail.  Can you add an option for real-time chat?  What would it take to move up to voice? And ultimately, video? Each step in the progression requires advances in technology, so you may have to wait before you can actually deploy something. But if you can envision the system you want, you can keep an eye on the tech and be poised to pounce when it is generally available.

Unlike dating sites, financial exchanges are marketplaces for opposing views. One person thinks that some event will happen, they seek a counterpart who thinks that it will not, and fate determines the winner.  This can be as vanilla as people buying or selling shares of stock, where the counterparties believe the share price will rise or fall, respectively.  You also see situations that call for more exotic tools.  In the lead-up to what would become the 2008 financial crisis, investors wanted to stake claims around mortgage-backed securities but there wasn’t a way to express the belief that those prices would fall. In response to this desire, a group of banks dusted off the credit default swap (CDS) concept and devised a standard, easily-tradable contract.  Now there was a way for people to take either side of the trade, and for the banks to collect fees in the middle.  A win-win-win situation.

(Well, the actual trade was a win-win-win. The long-term outcome was more of a lose-lose-win. Mortgage defaults rose, sending prices for the associated mortgage-backed securities into decline, leading to big payouts for the “I told you this was going to happen” side of each CDS contract. The banks that served double-duty as both market participant and middleman took on sizable losses as a result. Let this be a lesson to you: part of why a middleman makes money is precisely because they have no stake in the long-term outcome of putting the parties together. Stay in the middle if you want to play it safe.)

Granted, you don’t have to roll out every possible product or feature on your first day. You have to let the marketplace grow and mature somewhat, to see what will actually be useful. Still, you want to plan ahead. As you watch the marketplace, you will spot opportunities well in advance, so you can position yourself to implement them before the need is urgent.

Focus on your business

Besides making things easier for customers, being a better middleman means improving how your business runs.

To start, identify and eliminate inefficiencies in your operations. I don’t mean that you should cut corners, as that will come back to bite you later.  I mean that you can check for genuine money leaks. The easy candidates will be right there on your balance sheet: have you actually used Service ABC in the last year?  If not, maybe it’s time to cut it. Is there an equivalent to Service XYZ at a lower price? Once you’ve confirmed that the cheaper service is indeed a suitable replacement, it’s time to make the switch.

A more subtle candidate is your codebase. Custom code is a weird form of debt. It requires steady, ongoing maintenance just like payments in a loan. It may also require disruptive changes if you encounter a bug. (Imagine that your mortgage lender occasionally demanded a surprise lump sum in mid-month.) Can you replace that home-grown system with an off-the-shelf tool or a third-party service, for a cheaper and more predictable payment schedule?

You also want to check on the size of your total addressable market (TAM).  What happens when you’ve reached everyone who will ever join? It’s emotionally reassuring to tell yourself that the entire planet will use your service, sure. But do you really want to base revenue projections on customers you can’t realistically acquire or retain? At some point, your customer numbers will plateau (and, after that, sink). You need to have a difficult conversation with yourself, your leadership team, and your investors around how you’ll handle that. And you need to have that conversation well in advance. Once you hit that limit on your TAM, you’ll need to be ready to deliver improvements that reduce churn. Perhaps you can offer new services, which may extend your addressable market into new territory, but even that has its limits.

What are you doing for risk management? A risk represents a possible future entry on your balance sheet, one of indeterminate size. Maybe it’s a code bug that spirals out of control under an edge case. Or a lingering complaint that blossoms into a full-scale PR issue. To be blunt: good risk management will save you money. Possibly lots of money. While it’s tempting to let some potential problems linger, understand that it’s easier and cheaper to address them early and on your own schedule. That’s much nicer than being under pressure to fix a surprise in real-time.

Sharp-eyed readers will catch that subtle tradeoff between “addressing inefficiencies” and “proactively mitigating risks.” Risk management often requires that you leave extra slack in the system, such as higher staff headcount, or extra machines that mostly sit idle. This slack serves as a cushion in the event of a surge in customer activity but it also costs money.  There’s no easy answer here. It’s a blend of art and science to spot the difference between slack and waste.

Most of all, as a marketplace, you want to mature with your customers and the field overall. The term “innovate” gets some much-deserved flak, but it’s not complete hogwash. Be prepared to invest in research so you can see what changes are on the horizon, and then adapt accordingly. Also, keep an eye on the new features your customers are asking for, or the complaints they raise about your service. Otherwise, you’ll fall into the very trap described in The Innovator’s Dilemma. Don’t become the slow-moving, inattentive behemoth that some nimble upstart will work to unseat.

Use technology as a force multiplier

Bad middlemen squeeze the parties they connect; good middlemen squeeze technology.

Done well, technology is a source of asymmetric advantage. Putting code in the right places allows you to accomplish more work, more consistently, with fewer people, and in less time. All of the efficiencies you get through code will leave more money to split between yourself and your customers.  That is a solid retention strategy.

To start, you can apply software to the real and artificial scarcities that incumbent middlemen rely on. A greenfield operation can start with lower headcount, less (or zero!) office space, and so on.

Tech staffing, for example, is a matching problem at its core. A smart staffing firm would start with self-service search tools so a company could easily find people to match their open roles. No need to interact with a human recruiter. It could also standardize contract language to reduce legal overhead (no one wants a thousand slightly-different contracts lying around, anyway) and use electronic signatures to make it easier to store paperwork for future reference.

You don’t even have to do anything fancy. Sometimes, the very act of putting something online is a huge step up from the incumbent solution. Craigslist, simply by running classified ads on a website, gave people a much-improved experience over the print-newspaper version. People had more space to write (goodbye, obscure acronyms), had search functionality (why skim all the listings to find what you’re after?), and could pull their ad when it had been resolved (no more getting phone calls for an extra week just because the print ad is still visible).

Technology also makes it easier to manage resources. Love or loathe them, rideshare companies like Lyft and Uber can scale to a greater number of drivers and riders than the old-school taxi companies that rely on radio dispatch and flag-pulls. And they can do it with less friction. Why call a company and tell them your pickup location, when an app can use your phone’s GPS? And why should that dispatcher have to radio around in search of a driver? To arrange a ride, you need to match three elements–pickup location, dropoff location, and number of passengers–to an available driver. This is a trivial effort for a computer. Throw in mobile apps for drivers and passengers, and you have a system that can scale very well.

(Some may argue that the rideshare companies get extra scale because their drivers are classified as independent contractors, and because they don’t require expensive taxi medallions. I don’t disagree. I just want to point out that the companies’ technology is also a strong enabler.)
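The core matching step described above really is trivial for a computer. A minimal sketch, with invented driver IDs, coordinates, and capacities (a production system would match on road travel time rather than straight-line distance):

```python
# Toy rideshare matching: pair a pickup request with the nearest available
# driver that has enough seats for the party.
import math

def match_ride(pickup, passengers, drivers):
    """drivers: list of (driver_id, (x, y), seats). Returns the best driver id."""
    candidates = [d for d in drivers if d[2] >= passengers]
    if not candidates:
        return None
    # Nearest qualifying driver by straight-line distance.
    return min(candidates, key=lambda d: math.dist(pickup, d[1]))[0]

drivers = [("d1", (0.0, 0.0), 4), ("d2", (1.0, 1.0), 2), ("d3", (0.2, 0.1), 6)]
chosen = match_ride(pickup=(0.3, 0.1), passengers=5, drivers=drivers)
# Only d3 has room for 5 passengers, so it wins regardless of distance.
```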

Being at the center of the marketplace means you get to see the entire system at once. You can analyze the data around customer activity, and pass on insights to market participants to make their lives easier. Airbnb, for example, has deep insight into how different properties perform. Their research team determined that listings with high-quality photos tend to earn more revenue. They publicized this information to help hosts and, to sweeten the deal, the company then built a service to connect hosts with professional photographers.

What about ML/AI? While I hardly believe that it’s ready to eat every job, I do see opportunities for AI to make a smaller team of people more effective. ML models are well-suited for decisions that are too fuzzy or cumbersome to be expressed as hard rules in software, but not so nuanced that they require human judgment. Putting AI in the seat for those decisions frees up your team for things that genuinely merit a human’s eyes and expertise.

I’ve argued before that a lot of machine learning is high-powered matching. What is “classification,” if not rating one item’s similarity to an archetype?  A marketplace that deals in the long tail of goods can use ML to help with that matching.

Take Reverb, where most pieces of gear are unique but still similar to other items. They’re neither completely fungible, nor completely non-fungible.  They’re sort of semi-fungible. To simplify search, then, Director of Analytics Cyril Nigg says that the company groups related items into ML-based canonical products (where some specific Product X is really part of a wider Canonical Product Y). “[We use] ML to match listings to a product–say, matching on title, price point, or some other attribute. This tells us, with a high degree of confidence, that a seller’s used Fender guitar is actually an American Standard Stratocaster. Now that we know the make and model, a buyer can easily compare all the different listings within that product to help them find the best option. This ML system learns over time, so that a seller can upload a listing and the system can file it under the proper canonical product.”
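That canonical-product matching can be sketched in a few lines. This is an illustrative toy using Jaccard similarity over title tokens, not Reverb's actual system; the product names and listing title are made up:

```python
# Toy sketch of filing a listing under a canonical product: score the listing
# title against each known product name by token overlap (Jaccard similarity)
# and pick the best match.
import re

CANONICAL = [
    "Fender American Standard Stratocaster",
    "Gibson Les Paul Standard",
    "Boss DS-1 Distortion Pedal",
]

def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def match_listing(title):
    """Return the canonical product whose name best overlaps the listing title."""
    t = tokens(title)
    def jaccard(product):
        p = tokens(product)
        return len(t & p) / len(t | p)
    return max(CANONICAL, key=jaccard)

best = match_listing("Used Fender Stratocaster, American Standard, sunburst")
# → "Fender American Standard Stratocaster"
```

A real system would learn from many attributes (price point, category, seller history) instead of raw token overlap, but the matching idea is the same.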

Machine-based matching works for food as well as guitars. Resham Sarkar heads up data science at Slice, which gives local pizzerias the tools, technology and guidance they need to thrive. In a 2021 interview, she told me how her team applies ML to answer the age-old question: will Person X enjoy Pizza Y at Restaurant Z? Slice’s recommendations give eaters the confidence to try a new flavor in a new location, which helps them (maybe they’ll develop a new favorite) and also helps pizzerias (they get new customers). This is especially useful when a pizza lover lands in a new city and doesn’t know where to get their fix.

Any discussion of technology wouldn’t be complete without a nod to emerging tech. Yes, keeping up with the Shiny New Thing of the Moment means having to wade through plenty of hype. But if you look closely, you may also find some real game-changers for your business. This was certainly true of the 1990s internet boom. We’ve seen it in the past decade of what we now call AI, across all of its rebrandings. And yes, I expect that blockchain technologies will prove more useful than the curmudgeons want to let on.  (Even NFTs. Or, especially NFTs.)

Skip past the success stories and vendor pitches, though. Do your own homework on what the new technology really is and what it can do. Then, engage an expert to help you fill in the gaps and sort out what is possible with your business. The way a new technology addresses your challenges may not align with whatever is being hyped in the news, but who cares? All that matters is that it drives improvements for your use cases.

Watch your tech

Technology is a double-edged sword. It’s like using leverage in the stock market: employing software or AI exposes you to higher highs when things go right, but also lower lows when things unravel.

One benefit to employing people to perform a task is that they can notice when something is wrong and then stop working. A piece of code, by comparison, has no idea that it is operating out of its depth. The same tools that let you do so much more, with far fewer people, also expose you to a sizable risk: one bug or environmental disconnect can trigger a series of errors, at machine speeds, cascading into a massive failure.

All it takes is for a few smaller problems to collide. Consider the case of Knight Capital. This experienced, heavyweight market-maker once managed $21BN in daily transaction volume on the NYSE. One day in 2012, an inconsistent software deployment met a branch of old code, which in turn collided with a new order type on the exchange. This led to a meltdown in which Knight Capital lost $440M in under an hour.

The lesson here is that some of the money you save from reduced headcount should be reinvested in the company in the form of people and tools to keep an eye on the larger system. You’ll want to separate responsibilities in order to provide checks and balances, such as assigning someone who is not a developer to manage and review code deployments. Install monitors that provide fine-grained information about the state of your systems. Borrowing a line from a colleague: you can almost never have too many dimensions of data when troubleshooting.

You’ll also need people to step in when someone gets caught in your web of automation. Have you ever called a company’s customer service line, only to wind up in a phone-tree dead-end? That can be very frustrating. You don’t want that for your customers, so you need to build escape hatches that route them to a person. That holds for your AI-driven chatbot as much as your self-help customer service workflows. And especially for any place where people can report a bug or an emergency situation.

Most of all, this level of automation requires a high-caliber team. Don’t skimp on hiring. Pay a premium for very experienced people to build and manage your technology. If you can, hire someone who has built trading systems on Wall St. That culture is wired to identify and handle risk in complex, automated systems where there is a lot of real money at stake.  And they have seen technology fail in ways that you cannot imagine.

Markets, everywhere

I’ve often said that problems in technology are rarely tech-related; they’re people-related. The same holds for building a marketplace, where the big problem is really human greed.

Don’t fall for the greed trap. You can certainly run the business in a way that brings you revenue, keeps customers happy, and attracts new prospects. Identify inefficiencies in your business operations, and keep thinking of ways to make the platform better for your customers. That’s it.  A proper application of software and AI, risk management, and research into emerging technologies should help you with both. And the money you save, you can split with your user base.

If you’re willing to blur the lines a little, you will probably find markets in not-so-obvious places. An airline sits between passengers and destinations. Grocery stores sit between shoppers and suppliers. Employers sit between employees and clients. And so on. Once you find the right angle, you can borrow ideas from the established, well-run middlemen to improve your business.

(Many thanks to Chris Butler for his thoughtful and insightful feedback on early drafts of this article.)


Quantum Computing without the Hype

O'Reilly Radar - Tue, 2022/05/10 - 04:45

Several weeks ago, I had a great conversation with Sebastian Hassinger about the state of quantum computing. It’s exciting–but also, not what a lot of people are expecting.

I’ve seen articles in the trade press telling people to invest in quantum computing now or they’ll be hopelessly behind. That’s silly. There are too many people in the world who think that a quantum computer is just a fast mainframe. It isn’t; quantum programming is completely different, and right now, the number of algorithms we know that will work on quantum computers is very small. You can count them on your fingers and toes. While it’s probably important to prepare for quantum computers that can decrypt current cryptographic codes, those computers won’t be around for 10 to 20 years. While there is still debate on how many physical qubits will be needed for error correction, and even on the meaning of a “logical” (error-corrected) qubit, the most common estimates are that it will require on the order of 1,000 error-corrected qubits to break current encryption systems, and that it will take 1,000 physical qubits to make one error-corrected qubit. So we’ll need on the order of a million qubits, and current quantum computers are all in the area of 100 qubits. Figuring out how to scale our current quantum computers by four orders of magnitude may well be the biggest problem facing researchers, and there’s no solution in sight.
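The arithmetic behind those estimates is easy to check (the inputs are rough estimates, not hard numbers):

```python
# Checking the qubit-scaling estimates: roughly 1,000 logical qubits to break
# current encryption, roughly 1,000 physical qubits per logical qubit, and
# today's machines in the neighborhood of 100 physical qubits.
import math

logical_needed = 1_000        # error-corrected qubits to threaten current crypto
physical_per_logical = 1_000  # physical qubits per error-corrected qubit
current_machines = 100        # rough size of today's devices

physical_needed = logical_needed * physical_per_logical  # 1,000,000
gap = math.log10(physical_needed / current_machines)
# gap is 4.0: today's machines must grow by four orders of magnitude.
```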

So what can quantum computers do now that’s interesting? First, they are excellent tools for simulating quantum behavior: the behavior of subatomic particles and atoms that make up everything from semiconductors to bridges to proteins. Most, if not all, modeling in these areas is based on numerical methods–and modern digital computers are great at that. But it’s time to think again about non-numerical methods: can a quantum computer simulate directly what happens when two atoms interact? Can it figure out what kind of molecules will be formed, and what their shapes will be? This is the next step forward in quantum computing, and while it’s still research, it’s a significant step forward. We live in a quantum world. We can’t observe quantum behavior directly, but it’s what makes your laptop work and your bridges stay up. If we can model that behavior directly with quantum computers, rather than through numeric analysis, we’ll take a huge step toward finding new kinds of materials, new treatments for disease, and more. In a way, it’s like the difference between analog and digital computers. Any engineer knows that digital computers spend a lot of time finding approximate numeric solutions to complicated differential equations. But until digital computers got sufficiently large and fast, the behavior of those systems could be modeled directly on analog computers. Perhaps the earliest known examples of analog computers are Stonehenge and the Antikythera mechanism, both of which were used to predict astronomical positions. Thousands of years before digital computers existed, these analog computers modeled the behavior of the cosmos, solving equations that their makers couldn’t have understood–and that we now solve numerically on digital computers.

Recently, researchers have developed a standardized control plane that should be able to work with all kinds of quantum devices. The design of the control plane, including software, is all open source. This should greatly decrease the cost of experimentation, allowing researchers to focus on the quantum devices themselves, instead of designing the circuitry needed to manage the qubits.  It’s not unlike the dashboard of a car: relatively early in automotive history, we developed a fairly standard set of tools for displaying data and controlling the machinery.  If we hadn’t, the development of automobiles would have been set back by decades: every automaker would need to design its own controls, and you’d need fairly extensive training on your specific car before you could drive it. Programming languages for quantum devices also need to standardize; fortunately, there has already been a lot of work in that direction.  Open source development kits provide libraries that can be called from Python to perform quantum operations (Qiskit, Braket, and Cirq are some examples), and OpenQASM is an open source “quantum assembly language” that lets programmers write (virtual) machine-level code that can be mapped to instructions on a physical machine.

Another approach to simulating quantum behavior won’t help probe quantum behavior, but might help researchers to develop algorithms for numerical computing. P-bits, or probabilistic bits, behave probabilistically but don’t depend on quantum physics: they’re traditional electronics that work at room temperature. P-bits have some of the behavior of qubits, but they’re much easier to build; the developers call them “poor man’s qubits.” Will p-bits make it easier to develop a quantum future?  Possibly.
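The statistical idea behind a p-bit is easy to simulate with ordinary code. This is only a software toy illustrating the behavior; real p-bits are tunable stochastic circuits, not programs:

```python
# Toy p-bit: a bit that fluctuates between 0 and 1, spending a fraction of its
# time at 1 controlled by a bias input. Sampling it many times recovers the bias.
import random

def sample_pbit(bias, n, seed=42):
    """Sample a simulated p-bit n times; `bias` is the probability of reading 1."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return sum(1 for _ in range(n) if rng.random() < bias) / n

freq = sample_pbit(bias=0.7, n=100_000)
# freq lands close to 0.7 -- the p-bit "prefers" 1 but keeps fluctuating.
```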

It’s important not to get over-excited about quantum computing. The best way to avoid a “trough of disillusionment” is to be realistic about your expectations in the first place. Most of what computers currently do will remain unchanged. There will be some breakthroughs in areas like cryptography, search, and a few other areas where we’ve developed algorithms. Right now, “preparing for quantum computing” means evaluating your cryptographic infrastructure. Given that infrastructure changes are difficult, expensive, and slow, it makes sense to prepare for quantum-safe cryptography now. (Quantum-safe cryptography is cryptography that can’t be broken by quantum computers–it does not require quantum computers.)  Quantum computers may still be 20 years in the future, but infrastructure upgrades could easily take that long.

Practical (numeric) quantum computing at significant scale could be 10 to 20 years away, but a few breakthroughs could shorten that time drastically.  In the meantime, a lot of work still needs to be done on discovering quantum algorithms. And a lot of important work can already be done by using quantum computers as tools for investigating quantum behavior. It is an exciting time; it’s just important to be excited by the right things, and not misled by the hype.


Radar trends to watch: May 2022

O'Reilly Radar - Tue, 2022/05/03 - 04:19

April was the month for large language models. There was one announcement after another; most new models were larger than the previous ones, and several claimed to be significantly more energy efficient. The largest (as far as we know) is Google’s GLaM, with 1.2 trillion parameters–but requiring significantly less energy to train than GPT-3. Chinchilla has ¼ as many parameters as GPT-3, but claims to outperform it. It’s not clear where the race to bigger and bigger models will end, or where it will lead us. The PaLM model claims to be able to reason about cause and effect (in addition to being more efficient than other large models); we don’t yet have thinking machines (and we may never), but we’re getting closer. It’s also good to see that energy efficiency has become part of the conversation.

  • Google has created GLaM, a 1.2-trillion-parameter model (7 times the size of GPT-3). Training GLaM required 456 megawatt-hours, about ⅓ the energy of GPT-3. GLaM uses a Mixture-of-Experts (MoE) architecture, in which different subsets of the neural network are used, depending on the input.
  • Google has released a dataset of 3D-scanned household items.  This will be invaluable for anyone working on AI for virtual reality.
  • FOMO (Faster Objects, More Objects) is a machine learning model for object detection in real time that requires less than 200KB of memory. It’s part of the TinyML movement: machine learning for small embedded systems.
  • LAION (Large Scale Artificial Intelligence Open Network) is a non-profit, free, and open organization that is creating large models and making them available to the public. It’s what OpenAI was supposed to be. Its first release is a dataset of image-text pairs for training models similar to DALL-E.
  • NVIDIA is using AI to automate the design of its latest GPU chips.
  • Using AI to inspect sewer pipes is one example of an “unseen” AI application. It’s infrastructural, it doesn’t risk incorporating biases or significant ethical problems, and (if it works) it improves the quality of human life.
  • Large language models are generally based on text. Facebook is working on building a language model from spoken language, which is a much more difficult problem.
  • STEGO is a new algorithm for automatically labeling image data. It uses transformers to understand relationships between objects, allowing it to segment and label objects without human input.
  • A researcher has developed a model for predicting first impressions and stereotypes, based on a photograph.  They’re careful to say that this model could easily be used to fine-tune fakes for maximum impact, and that “first impressions” don’t actually say anything about a person.
  • A group building language models for the Maori people shows that AI for indigenous languages requires different ways of thinking about artificial intelligence, data, and data rights.
  • AI21 is a new company offering a large language model “as a service.” They allow customers to train custom versions of their model, and they claim to make humans and machines “thought partners.”
  • Researchers have found a method for reducing toxic text generated by language models. It sounds like a GAN (generative adversarial network), in which a model trained to produce toxic text “plays against” a model being trained to detect and reject toxicity.
  • More bad applications of AI: companies are using AI to monitor your mood during sales calls.  This questionable feature will soon be coming to Zoom.
  • Primer has developed a tool that uses AI to transcribe, translate, and analyze intercepted communications in the war between Russia and Ukraine.
  • DeepMind claims that another new large language model, Chinchilla, outperforms GPT-3 and Gopher with far fewer parameters (roughly ¼ of Gopher’s). It was trained on roughly 4 times as much data, but with fewer parameters, it requires less energy to train and fine-tune.
  • Data Reliability Engineering (DRE) borrows ideas from SRE and DevOps as a framework to provide higher-quality data for machine learning applications while reducing the manual labor required. It’s closely related to data-centric AI.
  • OpenAI’s DALL-E 2 is a new take on their system (DALL-E) for generating images from natural language descriptions. It is also capable of modifying existing artworks based on natural language descriptions of the modifications. OpenAI plans to open DALL-E 2 to the public, on terms similar to GPT-3.
  • Google’s new Pathways Language Model (PaLM) can understand concepts and reason about cause and effect, in addition to being relatively energy-efficient. It’s another step forward towards AI that actually appears to think.
  • SandboxAQ is an Alphabet startup that is using AI to build technologies needed for a post-quantum world.  They’re not doing quantum computing as such, but solving problems such as protocols for post-quantum cryptography.
  • IBM has open sourced the Generative Toolkit for Scientific Discovery (GT4SD), a toolkit of generative models designed to produce new ideas for scientific research, both in machine learning and in areas like biology and materials science.
  • Waymo (Alphabet’s self-driving car company) now offers driverless service in San Francisco.  San Francisco is a more challenging environment than Phoenix, where Waymo has offered driverless service since 2020. Participation is limited to members of their Trusted Tester program.
  • Mastodon, a decentralized social network, appears to be benefitting from Elon Musk’s takeover of Twitter.
  • Reputation and identity management for web3 is a significant problem: how do you verify identity and reputation without giving applications more information than they should have?  A startup called Ontology claims to have solved it.
  • A virtual art museum for NFTs is still under construction, but it exists, and you can visit it. It’s probably a better experience in VR.
  • 2022 promises to be an even bigger year for cryptocrime than 2021. Attacks are increasingly focused on decentralized finance (DeFi) platforms.
  • Could a web3 version of Wikipedia evade Russia’s demands that they remove “prohibited information”?  Or will it lead to a Wikipedia that’s distorted by economic incentives (like past attempts to build a blockchain-based encyclopedia)?
  • The Helium Network is a decentralized public wide area network using LoRaWAN that pays access point operators in cryptocurrency. The network has over 700,000 hotspots, and coverage in most of the world’s major metropolitan areas.
  • Do we really need another shell scripting language?  The developers of hush think we do.  Hush is based on Lua, and claims to make shell scripting more robust and maintainable.
  • WebAssembly is making inroads; here’s a list of startups using wasm for everything from client-side media editing to building serverless platforms, smart data pipelines, and other server-side infrastructure.
  • QR codes are awful. Are they less awful when they’re animated? It doesn’t sound like it should work, but playing games with the error correction built into the standard allows the construction of animated QR codes.
  • Build your own quantum computer (in simulation)?  The Qubit Game lets players “build” a quantum computer, starting with a single qubit.
  • One of Docker’s founders is developing a new product, Dagger, that will help developers manage DevOps pipelines.
  • Can applications use “ambient notifications” (like a breeze, a gentle tap, or a shift in shadows) rather than intrusive beeps and gongs?  Google has published Little Signals, six experiments with ambient notifications that include code, electronics, and 3D models for hardware.
  • Lambda Function URLs automate the configuration of an API endpoint for single-function microservices on AWS. They make the process of mapping a URL to a serverless function simple.
  • GitHub has added a dependency review feature that inspects the consequences of a pull request and warns of vulnerabilities that were introduced by new dependencies.
  • Google has proposed Supply Chain Levels for Software Artifacts (SLSA) as a framework for  ensuring the integrity of the software supply chain.  It is a set of security guidelines that can be used to generate metadata; the metadata can be audited and tracked to ensure that software components have not been tampered with and have traceable provenance.
  • Harvard and the Linux Foundation have produced Census II, which lists thousands of the most popular open source libraries and attempts to rank their usage.
  • The REvil ransomware has returned (maybe). Although there’s a lot of speculation, it isn’t yet clear what this means or who is behind it. Nevertheless, they appear to be looking for business partners.
  • Attackers used stolen OAuth tokens to compromise GitHub and download data from a number of organizations, most notably npm.
  • The NSA, Department of Energy, and other federal agencies have discovered a new malware toolkit named “pipedream” that is designed to disable power infrastructure. It’s adaptable to other critical infrastructure systems. It doesn’t appear to have been used yet.
  • A Russian state-sponsored group known as Sandworm failed in an attempt to bring down Ukraine’s power grid. They used new versions of Industroyer (for attacking industrial control systems) and CaddyWiper (for cleaning up after the attack).
  • Re-use of IP addresses by a cloud provider can lead to “cloud squatting,” where an organization that is assigned a previously used IP address receives data intended for the previous addressee. Address assignment has become highly dynamic; DNS wasn’t designed for that.
  • Pete Warden wants to build a coalition of researchers that will discuss ways of verifying the privacy of devices that have cameras and microphones (not limited to phones).
  • Cyber warfare on the home front: The FBI remotely accessed devices at some US companies to remove Russian botnet malware. The malware targets WatchGuard firewalls and Asus routers. The Cyclops Blink botnet was developed by the Russia-sponsored Sandworm group.
  • Ransomware attacks have been seen that target Jupyter Notebooks on notebook servers where authentication has been disabled. There doesn’t appear to be a significant vulnerability in Jupyter itself; just don’t disable authentication!
  • By using a version of differential privacy on video feeds, surveillance cameras can provide a limited kind of privacy. Users can ask questions about the image, but can’t identify individuals. (Whether anyone wants a surveillance camera with privacy features is another question.)
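One simple control against the cloud-squatting problem described above is auditing your DNS zone for records that point at addresses you no longer control. A minimal sketch–the hostnames, addresses, and owned ranges are all hypothetical, and a real audit would pull zone data from your DNS provider’s API:

```python
import ipaddress

def stale_records(dns_records, owned_networks):
    """Flag DNS records whose addresses fall outside the ranges we still own --
    cloud-squatting candidates if the old cloud IP gets reassigned."""
    owned = [ipaddress.ip_network(n) for n in owned_networks]
    stale = []
    for hostname, ip_str in dns_records:
        ip = ipaddress.ip_address(ip_str)
        if not any(ip in net for net in owned):
            stale.append((hostname, ip_str))
    return stale

# Hypothetical zone data and currently-owned ranges.
records = [("app.example.com", "203.0.113.10"),
           ("old-api.example.com", "198.51.100.7")]
print(stale_records(records, ["203.0.113.0/24"]))
# old-api.example.com points at an address outside the owned range.
```

The check is trivial; the hard part, as the item above notes, is that address assignment is now so dynamic that this kind of audit has to run continuously rather than once.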
Biology and Neuroscience
  • A brain-computer interface has allowed an ALS patient who was completely “locked in” to communicate with the outside world.  Communication is slow, but it goes well beyond simple yes/no requests.
  • CAT scans aren’t just for radiology. Lumafield has produced a table-sized CT-scan machine that can be used in small shops and offices, with the image analysis done in their cloud.
  • Boston Dynamics has a second robot on the market: Stretch, a box-handling robot designed to perform tasks like unloading trucks and shipping containers.
  • A startup claims it has the ability to put thousands of single-molecule biosensors on a silicon chip that can be mass-produced. They intend to have a commercial product by the end of 2022.

Building a Better Middleman

O'Reilly Radar - Tue, 2022/04/19 - 05:22

What comes to mind when you hear the term “two-sided market?” Maybe you imagine a Party A who needs something, so they interact with Party B who provides it, and that’s that.  Despite the number “two” in the name, there’s actually someone else involved: the middleman.  This entity sits between the parties to make it easier for them to interact. (We can generalize that “two” to some arbitrary number and call this an N-sided market or multi-sided marketplace. But we’ll focus on the two-sided form for now.)

Two-sided markets are a fascinating study. They are also quite common in the business world, and therefore, so are middlemen. Record labels, rideshare companies, even dating apps all fall under this umbrella.  The role has plenty of perks, as well as some sizable pitfalls.  “Middleman” often carries a negative connotation because, in all fairness, some of them provide little value compared to what they ask in return.

Still, there’s room for everyone involved—Party A, Party B, and the middleman—to engage in a happy and healthy relationship.  In this first article, I’ll explain more about the middleman’s role and the challenges they face.  In the next article, I’ll explore what it takes to make a better middleman and how technology can play a role.

Paving the Path

When I say that middlemen make interactions easier, I mean that they address a variety of barriers:

  • Discovery: “Where do I find the other side of my need or transaction?” Dating apps like OKCupid, classified ads services such as Craigslist, and directory sites like Angi (formerly Angie’s List) are all a twist on a search engine. Party A posts a description of themselves or their service; Party B scrolls and sifts the list while evaluating potential matches for fit.
  • Matching: “Should we interact? Are our needs compatible?” Many middlemen that help with discovery also handle the matching for you, as with ride-share apps.  Instead of you having to scroll through lists of drivers, Uber and Lyft use your phone’s GPS to pair you with someone nearby.  (Compared to the Discovery case, Matching works best when one or both counterparties are easily interchangeable.)
  • Standardization: “The middleman sets the rules of engagement, so we all know what to expect.”  A common example would be when a middleman like eBay sets the accepted methods of payment.  By narrowing the scope of what’s possible—by limiting options—the middleman standardizes how the parties interact.
  • Safety: “I don’t have to know you in order to exchange money with you.” Stock market exchanges and credit card companies build trust with Party A and Party B, individually, so the two parties (indirectly) trust each other through the transitive property.
  • Simplicity: “You two already know each other; I’ll insert myself into the middle, to make the relationship smoother.” Stripe and Squarespace make it easier for companies to sell goods and services by handling payments.  And then there’s Squire, which co-founder Songe Laron describes as the “operating system for the barber shop, [handling] everything from the booking, to the payment, to the point of sales system, to payroll,” and a host of other frictions between barber and customer.  In all cases, each party gets to focus on what it does best (selling goods or cutting hair) while the middleman handles the drudgework.
Nice Work, If You Can Get It

As far as their business model goes, middlemen usually take a cut of transactions as value moves from Party A to Party B. And this arrangement has its benefits.

For one, you’re first in line to get paid: Party A pays you, you take a cut, then you pass the rest on to Party B.  Record labels and book publishers are a common example.  They pair a creator with an audience.  All of the business deals for that creator’s work run through the middleman, who collects the revenue from sales and takes their share along the way.

(The music biz is littered with stories of artists getting a raw deal—making a small percentage of revenue from their albums, while the label takes the lion’s share—but that’s another story.)

Then there’s the opportunity for recurring revenue, if Party A and Party B have an ongoing relationship.  Companies often turn to tech staffing agencies to find staff-augmentation contractors.  Those agencies typically take a cut for the entire duration of the project or engagement, which can run anywhere from a few weeks to more than a decade.  The staffing agency makes one hell of a return on their efforts when placing such a long-term contractor. Nice work, if you can get it.

Staffing agencies may have to refund a customer’s money if a contractor performs poorly.  Some middlemen, however, make money no matter how the deal ultimately turns out.  Did I foolishly believe my friend’s hot stock tip, in his drunken reverie, and pour my savings into a bad investment? Well, NYSE isn’t going to refund my money, which means they aren’t about to lose their cut.

A middleman also gets a bird’s-eye view of the relationships it enables.  It sees who interacts with whom, and how that all happens.  Middlemen that run online platforms have the opportunity to double-dip on their revenue model: first by taking their cut from an interaction, then by collecting and analyzing data around each interaction.  Everything from an end-user’s contact or demographic details, to exploring patterns of how they communicate with other users, can be packaged up and resold.  (This is, admittedly, a little shady. We’ll get to middlemen’s abuse of privilege shortly.)

Saddling Some Burdens, Too

Before you rush out to build your own middleman company, recognize that it isn’t all easy revenue.  You first need to breathe the platform into existence, so the parties can interact.  Depending on the field, this can involve a significant outlay of capital, time, and effort.  Then you need to market the platform so that everyone knows where to go to find the Party B to their Party A.

Once it’s up and running, maintenance costs can be low if you keep things simple.  (Consider the rideshare companies that own the technology platform, but not the vehicles in which passengers ride.) But until you reach that cruising altitude, you’re crossing your fingers that things pan out in your favor.  That can mean a lot of sleepless nights and stressful investor calls.

The middleman’s other big challenge is that they need to keep all of those N sides of the N-sided market happy.  The market only exists because all of the parties want to come together, and your service persists only because they want to come together through you.  If one side gets mad and leaves, the other side(s) will soon follow.  Keeping the peace can be a touchy balancing act.

Consider Airbnb.  Early in the pandemic they earned praise from guests by allowing them to cancel certain bookings without penalty.  It then passed those “savings” on to hosts, who weren’t too happy about the lost revenue.  (Airbnb later created a fund to support hosts, but some say it still fell short.)  The action sent a clear—though, likely, unintentional and incorrect—message that Airbnb valued guests more than hosts.  A modern-day version of robbing Peter to pay Paul.

Keeping all sides happy is a tough line for a middleman to walk.  Mohanbir Sawhney, of Northwestern University’s Kellogg School of Management, summed this up well: “In any two-sided market, you always have to figure out who you’re going to subsidize more, and who you’re going to actually screw more.” It’s easy for outsiders to say that Airbnb should have just eaten the losses—refunded guests’ money while letting hosts keep their take—but that sounds much easier said than done.  In the end, the company still has to subsidize itself, right?

The subsidize versus screw decision calculus gets even more complicated when one side only wants you but doesn’t need you.  In the Airbnb case, the company effectively serves as a marketing arm and payments processor for property owners.  Any sufficiently motivated owner is just one step away from handling that on their own, so even a small negative nudge can send them packing.  (In economics terms, we say that those owners’ switching costs are low.)

The same holds for the tech sector, where independent contractors can bypass staffing firms to hang their own shingle.  Even rideshare drivers have a choice.  While it would be tougher for them to get their own taxi medallion, they can switch from Uber to Lyft.  Or, as many do, they can sign up with both services so that switching costs are effectively zero: “delete Uber app, keep the Lyft app running, done.”

Making Enemies

Even with those challenges, delivering on the middleman’s raison d’être—”keep all parties happy”—should be a straightforward affair.  (I don’t say “easy,” just “straightforward.” There’s a difference.) Parties A and B clearly want to be together, you’re helping them be together, so the experience should be a win all around.

Why, then, do middlemen have such a terrible reputation?  It mostly boils down to greed.

Once a middleman becomes a sufficiently large and/or established player, they become the de facto place for the parties to meet.  This is a near-monopoly status. The middleman no longer needs to care about keeping one or even both parties happy, they figure, because those groups either interact through the middleman or they don’t interact at all. (This also holds true for the near-cartel status of a group of equally unpleasant middlemen.)

Maybe the middleman suddenly raises fees, or sets onerous terms of service, or simply mistreats one side of the pairing.  This raises the dollar, effort, and emotional costs for the parties, who don’t have many options to leave.

Consider food-delivery apps, which consumers love but can take as much as a 30% cut of an order’s revenue.  That’s a large bite, but easier to swallow when a restaurant has a modest take-away business alongside a much larger dine-in experience. It’s quite another story when take-away is suddenly your entire business and you’re still paying rent on the empty dining room space. Most restaurants found themselves in just this position early in the COVID-19 pandemic. Some hung signs in their windows, asking customers to call them directly instead of using the delivery apps.
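The arithmetic makes the squeeze concrete. A toy margin model, with every number hypothetical:

```python
def restaurant_margin(revenue, food_cost_pct, fixed_costs, delivery_share, delivery_cut=0.30):
    """Net profit when `delivery_share` of revenue flows through an app
    taking `delivery_cut` of those orders. All inputs are hypothetical."""
    delivery_rev = revenue * delivery_share
    app_fees = delivery_rev * delivery_cut
    food_costs = revenue * food_cost_pct
    return revenue - food_costs - fixed_costs - app_fees

# Before: 20% of a $100k month is take-away through the apps.
print(restaurant_margin(100_000, 0.30, 55_000, 0.20))  # roughly $9,000 profit
# Pandemic: same rent, sales halved, and everything flows through the apps.
print(restaurant_margin(50_000, 0.30, 55_000, 1.00))   # roughly -$35,000 (a loss)
```

A cut that was tolerable on a marginal take-away business swamps the restaurant’s entire former profit once every order pays the app’s toll while the dining-room rent keeps accruing.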

Involving a middleman in a relationship can also lead to weird principal-agent problems.  Tech staffing agencies (even those that paint themselves as “consultancies”) have earned a special place here.  Big companies hand such “preferred vendors” a strong moat by requiring contractors to pass through them in lieu of establishing a direct relationship. Since the middlemen can play this Work Through Us, or Don’t Work at All card, it’s no surprise that they’ve been known to take as much as 50% of the money as it passes from client to contractor.  The client companies don’t always know this, so they are happy that the staffing agency has helped them find software developers and DBAs. The contractors, many of whom are aware of the large cuts, aren’t so keen on the arrangement.

This is on top of limiting a tech contractor’s ability to work through a competing agency.  I’ve seen everything from thinly-veiled threats (“if the client sees your resume from more than one agency, they’ll just throw it out”) to written agreements (“this contract says you won’t go through another agency to work with this client”).   What if you’ve found a different agency that will take a smaller cut, so you get more money?  Or what if Agency 1 has done a poor job of representing you, while you know that Agency 2 will get it right?  In both cases, the answer is: tough luck.

A middleman can also resort to more subtle ways to mistreat the parties.  Uber has reportedly used a variety of techniques from behavioral science–such as gamification, and male managers adopting female personas when messaging drivers–to encourage drivers to work more.  They’ve also been accused of showing drivers and passengers different routes, charging the passenger for the longer way and paying the driver for the shorter way.

It’s Not All Easy Money

To be fair, middlemen do earn some of their cut. They provide value in that they reduce friction for both the buy and sell sides of an interaction.

This goes above and beyond building the technology for a platform.  Part of how the Deliveroos and Doordashes of the world connect diners to restaurants is by coordinating fleets of delivery drivers.  It would be expensive for a restaurant to do this on their own: hiring multiple drivers, managing the schedule, accounting for demand … and hoping business stays hot so that the drivers aren’t paid to sit idle. Similarly, tech staffing firms don’t just introduce you to contract talent. They also handle time-tracking, invoicing, and legal agreements. The client company cuts one large check to the staffing firm, which cuts lots of smaller checks to the individual contractors.

Don’t forget that handling contracts and processing payments come with extra regulatory requirements. Rules often vary by locale, and the middleman has to spend money to keep track of those rules.  So it’s not all profit.

(They can also build tools to avoid rules, such as Uber’s infamous “greyball” system … but that’s another story.)

That said, a middleman’s benefit varies by the industry vertical and even by the client.  Some argue that their revenue cut far exceeds the value they provide. In the case of tech staffing firms, I’ve heard plenty of complaints that recruiters take far too much money for  just “having a phone number” (having a client relationship) and cutting a check, when it’s the contractor who does the actual work of building software or managing systems for the client.

A Win-Win-Win Triangle

Running a middleman has its challenges and risks.  It can also be tempting to misuse the role’s power.  Still, I say that there’s a way to build an N-sided marketplace where everyone can be happy.  I’ll explore that in the next article in this series.

(Many thanks to Chris Butler for his thoughtful and insightful feedback on early drafts of this article.  I’d also like to thank Mike Loukides for shepherding this piece into its final form.)


The General Purpose Pendulum

O'Reilly Radar - Tue, 2022/04/12 - 04:59

Pendulums do what they do: they swing one way, then they swing back the other way.  Some oscillate quickly; some slowly; and some so slowly you can watch the earth rotate underneath them. It’s a cliche to talk about any technical trend as a “pendulum,” though it’s accurate often enough.

We may be watching one of computing’s longest-term trends turn around, becoming the technological equivalent of Foucault’s very long, slow pendulum: the trend towards generalization. That trend has been swinging in the same direction for some 70 years–since the invention of computers, really.  The first computers were just calculating engines designed for specific purposes: breaking codes (in the case of Britain’s Bombe) or calculating missile trajectories. But those primitive computers soon got the ability to store programs, making them much more flexible; eventually, they became “general purpose” (i.e., business) computers. If you’ve ever seen a manual for the IBM 360’s machine language, you’ll see many instructions that only make sense in a business context–for example, instructions for arithmetic in binary coded decimal.

That was just the beginning. In the 70s, word processors started replacing typewriters. Word processors were essentially early personal computers designed for typing–and they were quickly replaced by personal computers themselves. With the invention of email, computers became communications devices. With file sharing software like Napster and MP3 players like WinAmp, computers started replacing radios–then, when Netflix started streaming, televisions. CD and DVD players are inflexible, task-specific computers, much like word processors or the Bombe, and their functions have been subsumed by general-purpose machines.

The trend towards generalization also took place within software. Sometime around the turn of the millennium, many of us realized that web browsers (yes, even the early Mosaic, Netscape, and Internet Explorer) could be used as a general user interface for software; all a program had to do was express its user interface in HTML (using forms for user input), and provide a web server so the browser could display the page. It’s not an accident that Java was perhaps the last programming language to have a graphical user interface (GUI) library; other languages that appeared at roughly the same time (Python and Ruby, for example) never needed one.

If we look at hardware, machines have gotten faster and faster–and more flexible in the process. I’ve already mentioned the appearance of instructions specifically for “business” in the IBM 360. GPUs are specialized hardware for high-speed computation and graphics; however, they’re much less specialized than their ancestors, dedicated vector processors.  Smartphones and tablets are essentially personal computers in a different form factor, and they have performance specs that beat supercomputers from the 1990s. And they’re also cameras, radios, televisions, game consoles, and even credit cards.

So, why do I think this pendulum might start swinging the other way?  A recent article in the Financial Times, Big Tech Raises its Bets on Chips, notes that Google and Amazon have both developed custom chips for use in their clouds. It hypothesizes that the next generation of hardware will be one in which chip development is integrated more closely into a wider strategy.  More specifically, “the best hope of producing new leaps forward in speed and performance lies in the co-design of hardware, software and neural networks.” Co-design sounds like designing hardware that is highly optimized for running neural networks, designing neural networks that are a good match for that specific hardware, and designing programming languages and tools for that specific combination of hardware and neural network. Rather than taking place sequentially (hardware first, then programming tools, then application software), all of these activities take place concurrently, informing each other. That sounds like a turn away from general-purpose hardware, at least superficially: the resulting chips will be good at doing one thing extremely well. It’s also worth noting that, while there is a lot of interest in quantum computing, quantum computers will inevitably be specialized processors attached to conventional computers. There is no reason to believe that a quantum computer can (or should) run general purpose software such as software that renders video streams, or software that calculates spreadsheets. Quantum computers will be a big part of our future–but not in a general-purpose way. Both co-design and quantum computing step away from general-purpose computing hardware. We’ve come to the end of Moore’s Law, and can’t expect further speedups from hardware itself.  We can expect improved performance by optimizing our hardware for a specific task.

Co-design of hardware, software, and neural networks will inevitably bring a new generation of tools to software development. What will those tools be? Our current development environments don’t require programmers to know much (if anything) about the hardware. Assembly language programming is a specialty that’s really only important for embedded systems (and not all of them) and a few applications that require the utmost in performance. In the world of co-design, will programmers need to know more about hardware? Or will a new generation of tools abstract the hardware away, even as they weave the hardware and the software together even more intimately? I can certainly imagine tools with modules for different kinds of neural network architectures; they might know about the kind of data the processor is expected to deal with; they might even allow a kind of “pre-training”–something that could ultimately give you GPT-3 on a chip. (Well, maybe not on a chip. Maybe a few thousand chips designed for some distributed computing architecture.) Will it be possible for a programmer to say “This is the kind of neural network I want, and this is how I want to program it,” and let the tool do the rest? If that sounds like a pipe-dream, realize that tools like GitHub Copilot are already automating programming.

Chip design is the poster child for “the first unit costs 10 billion dollars; the rest are all a penny apiece.”  That has limited chip design to well-financed companies that are either in the business of selling chips (like Intel and AMD) or that have specialized needs and can buy in very large quantities themselves (like Amazon and Google). Is that where it will stop–increasing the imbalance of power between a few wealthy companies and everyone else–or will co-design eventually enable smaller companies (and maybe even individuals) to build custom processors? To me, co-design doesn’t make sense if it’s limited to the world’s Amazons and Googles. They can already design custom chips.  It’s expensive, but that expense is itself a moat that competitors will find hard to cross. Co-design is about improved performance, yes; but as I’ve said, it’s also inevitably about improved tools.  Will those tools result in better access to semiconductor fabrication facilities?

We’ve seen that kind of transition before. Designing and making printed circuit boards used to be hard. I tried it once in high school; it requires acids and chemicals you don’t want to deal with, and a hobbyist definitely can’t do it in volume. But now, it’s easy: you design a circuit with a free tool like Kicad or Fritzing, have the tool generate a board layout, send the layout to a vendor through a web interface, and a few days later, a package arrives with your circuit boards. If you want, you can have the vendor source the board’s components and solder them in place for you. It costs a few tens of dollars, not thousands. Can the same thing happen at the chip level? It hasn’t yet. We’ve thought that field-programmable gate arrays might eventually democratize chip design, and to a limited extent, they have. FPGAs aren’t hard for small- or mid-sized businesses that can afford a few hardware engineers, but they’re far from universal, and they definitely haven’t made it to hobbyists or individuals.  Furthermore, FPGAs are still standardized (generalized) components; they don’t democratize the semiconductor fabrication plant.

What would “cloud computing” look like in a co-designed world? Let’s say that a mid-sized company designs a chip that implements a specialized language model, perhaps something like O’Reilly Answers. Would they have to run this chip on their own hardware, in their own datacenter?  Or would they be able to ship these chips to Amazon or Google for installation in their AWS and GCP data centers?  That would require a lot of work standardizing the interface to the chip, but it’s not inconceivable.  As part of this evolution, the co-design software will probably end up running in someone’s cloud (much as AWS Sagemaker does today), and it will “know” how to build devices that run on the cloud provider’s infrastructure. The future of cloud computing might be running custom hardware.

We inevitably have to ask what this will mean for users: for those who will use the online services and physical devices that these technologies enable. We may be seeing that pendulum swing back towards specialized devices. A product like Sonos speakers is essentially a re-specialization of the device that was formerly a stereo system, then became a computer. And while I (once) lamented the idea that we’d eventually all wear jackets with innumerable pockets filled with different gadgets (iPods, i-Android-phones, Fitbits, Yubikeys, a collection of dongles and earpods, you name it), some of those products make sense:  I lament the loss of the iPod, as distinct from the general purpose phone. A tiny device that could carry a large library of music, and do nothing else, was (and would still be) a wonder.

But those re-specialized devices will also change. A Sonos speaker is more specialized than a laptop plugged into an amp via the headphone jack and playing an MP3; but don’t mistake it for a 1980s stereo, either. If inexpensive, high-performance AI becomes commonplace, we can expect a new generation of exceedingly smart devices. That means voice control that really works (maybe even for those who speak with an accent), locks that can identify people accurately regardless of skin color, and appliances that can diagnose themselves and call a repairman when they need to be fixed. (I’ve always wanted a furnace that could notify my service contractor when it breaks at 2AM.) Putting intelligence on a local device could improve privacy–the device wouldn’t need to send as much data back to the mothership for processing. (We’re already seeing this on Android phones.) We might get autonomous vehicles that communicate with each other to optimize traffic patterns. We might go beyond voice controlled devices to non-invasive brain control. (Elon Musk’s Neuralink has the right idea, but few people will want sensors surgically embedded in their brains.)

And finally, as I write this, I realize that I’m writing on a laptop–but I don’t want a better laptop. With enough intelligence, would it be possible to build environments that are aware of what I want to do? And offer me the right tools when I want them (possibly something like Bret Victor’s Dynamicland)? After all, we don’t really want computers.  We want “bicycles for the mind”–but in the end, Steve Jobs only gave us computers.

That’s a big vision that will require embedded AI throughout. It will require lots of very specialized AI processors that have been optimized for performance and power consumption. Creating those specialized processors will require re-thinking how we design chips. Will that be co-design, designing the neural network, the processor, and the software together, as a single piece? Possibly. It will require a new way of thinking about tools for programming–but if we can build the right kind of tooling, “possibly” will become a certainty.

Categories: Technology

Radar trends to watch: April 2022

O'Reilly Radar - Tue, 2022/04/05 - 04:32

March was a busy month, especially for developers working with GPT-3. GPT-3 surprised everybody with its ability to write code, so it's not surprising that it's now appearing in other phases of software development. One group has written a tool that creates regular expressions from verbal descriptions; another tool generates Kubernetes configurations from verbal descriptions. In his newsletter, Andrew Ng talks about the future of low-code AI: it's not about eliminating coding, but about eliminating the need to write all the boilerplate. The latest developments with large language models like GPT-3 suggest that the future isn't that distant.
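
One caveat with generated regular expressions: they're easy to sanity-check against known examples before use, and it's worth doing. A minimal sketch, where the candidate pattern is a hypothetical model output (not taken from the tool mentioned above):

```python
import re

# Hypothetical model output for the prompt "match a US ZIP code,
# with an optional 4-digit extension" -- not real tool output.
candidate = r"^\d{5}(?:-\d{4})?$"

def pattern_ok(pattern, should_match, should_reject):
    """Check a generated regex against examples it must accept and refuse."""
    rx = re.compile(pattern)
    hits = all(rx.match(s) for s in should_match)
    misses = not any(rx.match(s) for s in should_reject)
    return hits and misses

print(pattern_ok(candidate,
                 should_match=["12345", "12345-6789"],
                 should_reject=["1234", "12345-678", "abcde"]))  # True
```

A check like this won't prove a pattern correct, but it catches the most common failure mode of generated regexes: quietly matching too much or too little.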

On the other hand, the US copyright office has determined that works created by machines are not copyrightable. If software is increasingly written by tools like Copilot, what will this say about software licensing and copyright?

Artificial Intelligence
  • An unusual form of matter known as spin glass can potentially allow the implementation of neural network algorithms in hardware. One particular kind of network allows pattern matching based on partial patterns (for example, face recognition based on a partial face), something that is difficult or impossible with current techniques.
  • OpenAI has extended GPT-3 to do research on the web when it needs information that it doesn’t already have.
  • Data-centric AI is gaining steam, in part because Andrew Ng has been pushing it consistently. Data-centric AI claims that the best way to improve the AI is to improve the data, rather than the algorithms. It includes ideas like machine-generated training data and automatic tagging. Christopher Ré, at one of the last Strata conferences, noted that data collection was the part of AI that was most resistant to improvement.
  • We’ve seen that GPT-3 can generate code from English language comments. Can it generate Kubernetes configurations from natural language descriptions?  Take a look at AI Kube Bot.
  • The US Copyright Office has determined that works created by an artificial intelligence aren’t copyrightable; copyright requires human authorship. This is almost certainly not the final word on the topic.
  • A neural network with a single neuron that is used many times may be as effective as large neural networks, while using much less energy.
  • Training AI models on synthetic data created by a generative model can be more effective than using real-world data. Although there are pitfalls, there’s more control over bias, and the data can be made to include unexpected cases.
  • For the past 70 years, computing has been dominated by general-purpose hardware: machines designed to run any code. Even vector processors and their descendants (GPUs) are fundamentally general purpose. The next steps forward in AI may involve software, hardware, and neural networks that are designed for each other.
  • Ithaca is a DeepMind project that uses deep learning to recover missing texts in ancient Greek documents and inscriptions.  It’s particularly interesting as an example of human-machine collaboration. Humans can do this work with 25% accuracy, Ithaca is 62% accurate on its own, but Ithaca and humans combined reach 72% accuracy.
  • Michigan is starting to build the infrastructure needed to support autonomous vehicles: dedicated lanes, communications, digital signage, and more.
  • Polycoder is an open source code generator (like Copilot) that uses GPT-2, which is also open sourced. Developers claim that Polycoder is better than Copilot for many tasks, including programming in C. Because it is open-source, it enables researchers to investigate how these tools work, including testing for security vulnerabilities.
  • New approaches to molecule design using self-supervised learning on unlabeled data promise to make drug discovery faster and more efficient.
  • The title says it all. Converting English to Regular Expressions with GPT-3, implemented as a Google sheet. Given Copilot, it’s not surprising that this can be done.
  • Researchers at MIT have developed a technique for injecting fairness into a model itself, even after it has been trained on biased data.
Programming
  • Low code programming for Python: Some new libraries designed for use in Jupyter Notebooks (Bamboo, Lux, and Mito) allow a graphical (forms-based) approach to working with data using Python’s Pandas library.
  • Will the Linkerd service mesh displace Istio?  Linkerd seems to be simpler and more attractive to small and medium-sized organizations.
  • The biggest problem with Stack Overflow is the number of answers that are out of date.  There’s now a paper studying the frequency of out-of-date answers.
Biology
  • Silkworm-based encryption: Generating good random numbers is a difficult problem. One surprising new source of randomness is silk.  While silk appears smooth, it is (not surprisingly) very irregular at a microscopic scale.  Because of this irregularity, passing light through silk generates random diffraction patterns, which can be converted into random numbers.
  • The Hub for Biotechnology in the Built Environment (HBBE) is a research center that is rethinking buildings. They intend to create “living buildings” (and I do not think that is a metaphor) capable of processing waste and producing energy.
  • A change to the protein used in CRISPR to edit DNA reduces errors by a factor of 4000, without making the process slower.
  • Researchers have observed the process by which brains store sequences of memories.  In addition to therapies for memory disorders, this discovery could lead to advances in artificial intelligence, which don’t really have the ability to create and process timelines or narratives.
Metaverse
  • Object detection in 3D is a critical technology for augmented reality (to say nothing of autonomous vehicles), and it’s significantly more complex than in 2D. Facebook/Meta’s 3DETR uses transformers to build models from 3D data.
  • Some ideas about what Apple’s AR glasses, Apple Glass, might be. Take what you want… Omitting a camera is a good idea, though it’s unclear how you’d make AR work. This article suggests LIDAR, but that doesn’t sound feasible.
  • According to the creator of Pokemon Go, the metaverse should be about helping people to appreciate the physical world, not about isolating them in a virtual world.
Security
  • Jeff Carr has been publishing (and writing about) dumps of Russian data obtained by hackers from GRUMO, Ukraine’s cyber operations team.
  • Sigstore is a new kind of certificate authority (trusted root) that is addressing open source software supply chain security problems.  The goal is to make software signing “ubiquitous, free, easy, and transparent.”
  • Russia has created its own certificate authority to mitigate international sanctions. However, users of Chrome, Firefox, Safari, and other browsers originating outside of Russia would have to install the Russian root certificate manually to access Russian sites without warnings.
  • Corporate contact forms are replacing email as a vector for transmitting malware. BazarBackdoor [sic] is now believed to be under development by the Conti ransomware group.
  • Dirty Pipe is a newly discovered high-severity bug in the Linux kernel that allows any user to overwrite any file or obtain root privileges. Android phones are also vulnerable.
  • Twitter has created an onion service that is accessible through the Tor network. (Facebook has a similar service.)  This service makes Twitter accessible within Russia, despite government censorship.
  • The attackers attacked: A security researcher has acquired and leaked chat server logs from the Conti ransomware group. These logs include discussions of victims, Bitcoin addresses, and discussions of the group’s support of Russia.
  • Attackers can force Amazon Echo devices to hack themselves. Get the device to speak a command, and its microphone will hear the command and execute it. This misfeature includes controlling other devices (like smart locks) via the Echo.
  • The Anonymous hacktivist collective is organizing (to use that word very loosely) attacks against Russian digital assets. Among other things, they have leaked emails between the Russian defense ministry and their suppliers, and hacked the front pages of several Russian news agencies.
  • The Data Detox Kit is a quick guide to the bot world and the spread of misinformation.  Is it a bot or not?  This site has other good articles about how to recognize misinformation.
Hardware
  • Sensor networks that are deployed like dandelion seeds! An extremely light, solar-powered framework scatters RF-connected sensors and lets breezes do the distribution, so researchers can easily build networks with thousands of sensors. I’m concerned about cleanup afterwards, but this is a breakthrough, both in biomimicry and low-power hardware.
  • Semiconductor-based LIDAR could be the key to autonomous vehicles that are reasonably priced and safe. LIDAR systems with mechanically rotating lasers have been the basis for Google’s autonomous vehicles; they are effective, but expensive.
  • The open source instruction set architecture RISC-V is gaining momentum because it is enabling innovation at the lowest levels of hardware.
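
The silkworm item above relies on a standard randomness-extraction trick: hash noisy physical measurements so their irregularity is condensed into uniform bits. A minimal sketch, with made-up integers standing in for diffraction-pattern intensities:

```python
import hashlib

def extract_random_bytes(measurements, n_bytes=32):
    """Condense noisy physical readings into uniform bytes by hashing.

    `measurements` is any sequence of non-negative ints below 2**32
    (here they stand in for pixel intensities of a diffraction pattern).
    """
    raw = b"".join(int(m).to_bytes(4, "big") for m in measurements)
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:  # stretch the digest if more bytes are needed
        out += hashlib.sha256(raw + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])

# Two slightly different (simulated) readings yield unrelated bitstreams:
a = extract_random_bytes([982, 17, 45021, 3, 7760])
b = extract_random_bytes([982, 17, 45021, 3, 7761])
```

Hashing is a crude extractor; a real design would first estimate how much entropy the physical source actually provides, and only extract that much.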
Quantum Computing
  • Microsoft claims to have made a breakthrough in creating topological qubits, which should be more stable and scalable than other approaches to quantum computing.
  • IBM’s quantum computer was used to simulate a time crystal, showing that current quantum computers can be used to investigate quantum processes, even if they can’t yet support useful computation.
Web
  • Mozilla has published their vision for the future evolution of the web. The executive summary highlights safety, privacy, and performance. They also want to see a web on which it’s easier for individuals to publish content.
  • Twitter is expanding its crowdsourced fact-checking program (called Birdwatch). It’s not yet clear whether this has helped stop the spread of misinformation.
  • The Gender Pay Gap Bot (@PayGapApp) retweets corporate tweets about International Women’s Day with a comment about the company’s gender pay gap (derived from a database in the UK).
  • Alex Russell writes about a unified theory of web performance.  The core principle is that the web is for humans. He emphasizes the importance of latency at the tail of the performance distribution; improvements there tend to help everyone.
  • WebGPU is a new API that gives web applications the ability to do rendering and computation on GPUs.
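
The tail-latency point in the web-performance item above is easy to demonstrate: a distribution can have a comfortable mean while the slowest requests, the ones users remember, are two orders of magnitude worse. A small sketch using synthetic latencies and a nearest-rank percentile:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample >= p percent of the data."""
    s = sorted(samples)
    k = max(0, math.ceil(p * len(s) / 100) - 1)
    return s[min(k, len(s) - 1)]

# 1000 synthetic request latencies (ms): mostly fast, with a slow tail.
latencies = [10] * 950 + [90] * 45 + [1500] * 5
mean = sum(latencies) / len(latencies)   # ~21 ms -- looks fine
p50 = percentile(latencies, 50)          # 10 ms
p99 = percentile(latencies, 99)          # 90 ms
p999 = percentile(latencies, 99.9)       # 1500 ms -- the tail users feel
```

This is why improvements at the tail tend to help everyone: the mean hides the requests that are actually painful.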
Blockchains and NFTs
Business
Categories: Technology