Feed aggregator

The Problem with Intelligence

O'Reilly Radar - Tue, 2022/09/13 - 04:21

Projects like OpenAI’s DALL-E and DeepMind’s Gato and LaMDA have stirred up many discussions of artificial general intelligence (AGI). These discussions tend not to go anywhere, largely because we don’t really know what intelligence is. We have some ideas–I’ve suggested that intelligence and consciousness are deeply connected to the ability to disobey, and others have suggested that intelligence can’t exist outside of embodiment (some sort of connection between the intelligence and the physical world). But we really don’t have a definition. We have a lot of partial definitions, all of which are bound to specific contexts.

For example, we often say that dogs are intelligent. But what do we mean by that? Some dogs, like sheep dogs, are very good at performing certain tasks. Most dogs can be trained to sit, fetch, and do other things. And they can disobey. The same is true of children, though we’d never compare a child’s intelligence to a dog’s. And cats won’t do any of those things, though we never refer to cats as unintelligent.

I’m very impressed with Irene Pepperberg’s work on parrot intelligence. She’s shown that her parrots can have an understanding of numbers, can use language intelligently, and can even invent new vocabulary. (“Banerry” for apple, probably because birds don’t have lips and can’t say Ps very well. And apples look like giant cherries and taste like bananas, at least to parrots.) But I wonder if even this is getting the question wrong. (I think Dr. Pepperberg would agree.) We ask birds to be intelligent about things humans are intelligent about. We never ask humans to be intelligent about things birds are intelligent about: navigating in three-dimensional space, storing food for use during winter (a boreal chickadee will store as many as 80,000 seeds in different places, and remember where they’re all located), making use of the many colors birds see that we can’t (their vision extends well into the ultraviolet). It’s easy to imagine a bird thinking, “Those poor humans. They can’t find their home without taking out that strange little black box (which is actually colored octarine).”

In a similar vein, we often say that dolphins and elephants are intelligent, but it’s never clear what exactly we mean by that. We’ve demonstrated that dolphins can recognize patterns and that they recognize themselves in mirrors, and they’ve demonstrated a (limited) ability to communicate with humans, but their intelligence certainly goes much further. I wouldn’t be the least bit surprised if animals like dolphins had an oral literature. We penalize them on the intelligence scale because they don’t have hands and can’t pick up a pen. Likewise, some research shows that elephants communicate with each other using low frequency rumbles that can be heard for miles (if you’re an elephant). Information theory suggests that this communication can’t be fast, but that doesn’t mean that it can’t be rich.

Humans are intelligent. After all, we get to define what “intelligence” means. Controlling the definition of intelligence has always been a source of cultural and political power; just read anything written in America in the 19th century about the intelligence of women, Asians, Africans, or even the Irish and Italians. We have “intelligence tests” to measure intelligence–or do they just measure test-taking ability? We also talk about “emotional” and other kinds of intelligence. And we recognize that mathematical, linguistic, and artistic ability rarely go hand-in-hand. Our own view of our own intelligence is highly fractured, and often has more to do with pseudo-science than anything we could use as a metric in machine learning experiments. (Though GPT-3 and LaMDA are no doubt very good at taking tests.)

Finally, there’s also been a lot of talk recently about the possibility of discovering life on other planets. Life is one thing, and my decidedly amateur opinion is that we will find life fairly common. However, to discover intelligent life, we would need a working definition of intelligence. The only useful definition I can imagine is “able to generate signals that can be received off planet and that are indisputably non-natural.” But by that definition, humans have only been intelligent for roughly 100 years, since the early days of radio. (I’m not convinced that the early electrical experiments from the 19th century and spark-based radio from the first two decades of the 20th century could be detected off planet.) There may be fantastically intelligent creatures living under the ice covering Saturn’s moon Titan, but we’ll never be able to detect them without going there. For Titan, a visit may be possible. For planets elsewhere in our galaxy, probably not.

Even more important: these definitions aren’t just different. They’re different in kind. We’re not saying that a parrot or a crow is intelligent if it scores 0.3 (on a scale of 0 to 1) on some test, but an autonomous vehicle has to score 0.99. The definitions aren’t remotely comparable. I don’t know what it would mean to ask GPT-3 about soaring on air currents. If we asked, we would get an answer, and quite likely a good one with a lot of information about aerodynamics, but would that have anything to do with an eagle’s understanding of flight? I could tell Gato to “sit,” but how would I know if it complied?

So what does this tell us about intelligence that’s artificial? Context is important; an appropriate definition of “intelligence” has to start with what we want the system to do. In some cases, that’s generating publishable papers and good PR. With natural language systems like GPT-3, we tend to ignore the fact that you often have to try several prompts to produce reasonable output. (Would we consider a human intelligent if they had to try 5 times to answer a question?) As has often been noted, systems like GPT-3 often get basic facts wrong. But humans often respond to prompts incoherently, and we frequently get our facts wrong.  We get things wrong in different ways, and for different reasons; investigating those differences might reveal something about how our intelligence works, and might lead us to a better understanding of what an “artificial intelligence” might mean.

But without that investigation, our standard for intelligence is fairly loose. An AI system for making product recommendations can be successful even if most of the recommendations are wrong–just look at Amazon. (I’m not being ironic. If there are 10 recommendations and you’re interested in one of them, Amazon has won.) An AI system for an autonomous vehicle has to work to a much higher standard. So do many systems where safety isn’t an issue. We could happily talk about the “intelligence” of an AI chess engine that can beat the average human player, but a chess playing product that can only beat the average human and couldn’t play on a world championship level would be an embarrassment.

Which is just to say that intelligence, especially of the artificial sort, is many things. If you read Turing’s paper on the Imitation Game, you’ll see quickly that Turing is more interested in the quality of the interaction than the correctness of the result. In his examples, the machine says that it’s not good at writing poetry; hesitates before giving answers; and even gets some results wrong. Turing’s thought experiment is more about whether a machine can behave like a human than about whether it can master many different disciplines. The word “intelligence” only appears once in the body of the paper, and then it refers to a human experimenter.

That leads me to a conclusion: Intelligence doesn’t have any single definition, and shouldn’t. Intelligence is always specific to the application.  Intelligence for a search engine isn’t the same as intelligence for an autonomous vehicle, isn’t the same as intelligence for a robotic bird, isn’t the same as intelligence for a language model. And it certainly isn’t the same as the intelligence for humans or for our unknown colleagues on other planets.

If that’s true, then why are we talking about “general intelligence” at all?  General intelligence assumes a single definition. Discarding the idea of a single unifying definition of “intelligence” doesn’t cost us much, and gains a lot: we are free to create definitions of “intelligence” that are appropriate to specific projects. When embarking on a new project, it’s always helpful to know exactly what you’re trying to achieve. This is great for practical, real-world engineering. And even big, expensive research projects like DALL-E, Gato, LaMDA, and GPT-3 are ultimately engineering projects. If you look beyond the link-bait claims about general intelligence, sentience, and the like, the computer scientists working on these projects are working against well-defined benchmarks. Whether these benchmarks have anything to do with “intelligence” isn’t relevant. They aren’t trying to create an artificial human, or even an artificial dog. (We’ll leave artificial dogs to Boston Dynamics.) They are trying–with considerable success–to extend the range of what computers can do. A model that can work successfully in over 600 different contexts is an important achievement. Whether or not that’s “general intelligence” (or intelligence at all) is a side show we don’t need.

Categories: Technology

Radar Trends to Watch: September 2022

O'Reilly Radar - Tue, 2022/09/06 - 04:21

It’s hardly news to talk about the AI developments of the last month. DALL-E is increasingly popular, and being used in production. Google has built a robot that incorporates a large language model so that it can respond to verbal requests. And we’ve seen a plausible argument that natural language models can be made to reflect human values, without raising the question of consciousness or sentience.

For the first time in a long time we’re talking about the Internet of Things. We’ve got a lot of robots, and Chicago is attempting to make a “smart city” that doesn’t facilitate surveillance. We’re also seeing a lot in biology. Can we make a real neural network from cultured neurons? The big question for biologists is how long it will take for any of their research to make it out of the lab.

Artificial Intelligence
  • Stable Diffusion is a new text-to-image model that has been designed to run on consumer GPUs. It has been released under a license that is similar to permissive open source licenses, but has restrictions requiring the model to be used ethically.
  • Researchers claim that they can use a neural network to reconstruct images (specifically, faces) that humans are seeing. They use fMRI to collect brain activity, and a neural decoding algorithm to turn that activity into images that are scarily similar to the photos the subjects were shown.
  • Research from Google and other institutions investigates the emergent properties of large language models: their ability to do things that can’t be predicted by scale alone.
  • DALL-E’s popularity is soaring and, like Copilot, it’s being adopted as a tool. It’s fun to play with, relatively inexpensive, and it’s increasingly being used for projects like designing logos and generating thumbnail images for a blog.
  • Elon Musk has announced that Tesla will have a robot capable of performing household chores by the end of 2022. That is almost certainly overly ambitious (and we hope it works better than his self-driving vehicles), but it’s no doubt coming.
  • Google has demonstrated a robot that can respond to verbal statements (for example, bringing food when you say “I’m hungry”) without being trained on those specific statements; it uses a large language model to interpret the statement and determine a response.
  • Molecular modeling with deep learning has been used to predict the way ice forms. This can be very important for understanding weather patterns; the technique may be applicable to developing new kinds of materials.
  • Brain.js is a deep learning library for JavaScript, designed to run in the browser and using the computer’s GPU (if available).
  • Graph neural networks may be able to predict sudden flareups in burning homes, the largest cause of death among firefighters.
  • While avoiding the question of whether language models are “intelligent,” Blaise Aguera y Arcas argues that language models can be trained to reflect particular moral values and standards of behavior.
  • A webcam mounted on a 3-D gimbal uses AI to automatically track moving objects. This could be a step towards making virtual reality less virtual.
  • A new political party in Denmark has policies determined entirely by AI. The Synthetic Party plans to run candidates for parliament in 2023.
  • One irony of AI work is that neural networks are designed by human intuition. Researchers are working on new AutoML systems that can quickly and efficiently design neural networks for specific tasks.
  • To succeed at deploying AI, AI developers need to understand and use DevOps methods and tools.
  • Cerebras, the company that released a gigantic (850,000 core) processor, claims their chip will democratize the hardware needed to train and run very large language models by eliminating the need to distribute computation across thousands of smaller GPUs.
  • Large language models are poor at planning and reasoning. Although they have done well on “common sense” benchmarks, they fail at planning tasks requiring more than one or two steps and that can’t be solved by doing simple statistics on their training data. Better benchmarks for planning and reasoning are needed to make progress.
  • A GPT-3 based application can answer philosophical questions well enough to fool philosophers. The authors make it clear that the machine is not “thinking”; it was intended as an experiment to demonstrate the danger of automated plagiarism.
  • DoNotPay has built a tool that finds racist language in real estate documents, and automates the process of having it removed. Not very surprisingly, it quickly discovered that clauses preventing the sale of property to non-Whites are extremely common.
  • Researchers have developed analog “neurons” that can build analog neural networks programmed similarly to digital neural networks. They are potentially much faster and require much less power.
  • A startup called Language I/O does machine translation by leveraging translations from Google, Facebook, and Amazon, then uses AI to choose the best and fine-tune the result, using customer-supplied vocabularies with minimal training.
  • KSplit is an automated framework for isolating operating system device drivers from each other and the OS kernel. Isolating device drivers is very difficult for human programmers, but greatly reduces vulnerabilities and bugs.
  • A report on the API economy says that one of the biggest obstacles is the lack of API design skills.
  • A bit of history comes back to life: An archive of everything written by why the lucky stiff (aka _why) is now online. _why was a mainstay of the Ruby community in the early 2000s; he disappeared from the community and took all of his content offline when a reporter revealed his name. Well worth reading; maybe even well worth re-acquainting yourself with Ruby.
  • Bun is a new JavaScript runtime that aims to replace Node.js. It’s still early, and doesn’t yet support some important NPM packages. But it’s very fast, and (like Deno) supports TypeScript natively.
  • Observability needs to “shift left”: that is, become a primary concern of developers, in addition to operations. Most observability tools are oriented towards operations, rather than software development.
  • mCaptcha is a proof-of-work-based Captcha system that avoids any human interaction like identifying images. It imposes a small penalty on genuine users that actors who want to pound web sites at scale won’t be willing to pay.
  • RStudio is renaming itself Posit. We don’t normally deal in corporate names, but this change is significant. Although R will remain a focus, RStudio has been looking beyond R; specifically, they’re interested in Python and their Jupyter-based Quarto publishing system.
  • Google is releasing open source tools for designing chips, and funding a program that allows developers to have their custom designs built at a fabrication facility. The goal is to jump-start an open source ecosystem for silicon.
  • Zero trust adoption has soared in the past year. According to Okta, 97% of the respondents to their recent “state of zero trust” survey say they have zero trust initiatives in place, or will have them within the next year.
  • An online tool named InAppBrowser can detect whether the browsers that are built in to mobile apps can be used to inject JavaScript into sites you visit. This kind of JavaScript injection isn’t always dangerous, but is often used to inject tracking code.
  • Google blocked a distributed denial of service (DDoS) attack against one of its cloud customers that peaked at 46 million requests per second, a record. The customer was using Google’s Cloud Armor service.
  • Chatbots backed by AI and NLP are becoming a significant problem for security. Well-designed chatbots can perform social engineering, execute denial of service attacks on customer service by generating complaints, and generate fake account credentials in bulk.
  • A security researcher has created a $25 tool that allows users to run custom code on terminals for the Starlink network. It requires attaching a board to your dish, but we suspect that enough Starlink users would be interested in “exploring” the satellite network to become a serious problem.
  • Message Franking is a cryptographic technology that includes end-to-end encryption, but also allows abusers to be held to account for misinformation–without revealing the content of the message.
  • One trick for detecting live deepfakes in video calls: ask the caller to turn sideways. Deepfake software is good at generating head-on views, but tends to fail badly at profiles.
  • Bruce Schneier on cryptographic agility: We need the ability to swap in cryptographic algorithms quickly, in light of the possibility that quantum computers will soon be able to break current codes. Industry adoption of new algorithms takes a long time, and we may not have time.
  • SHARPEXT is malware that installs a browser extension on Chrome or Edge that allows an attacker to read Gmail. It can’t be detected by email services. Users are tricked into installing it through a phishing attack.
  • Passage offers biometric authentication services that work across devices using WebAuthn. Biometric data is encrypted, of course, and never leaves the user’s device.
  • Watch the progress of the American Data Privacy Protection Act, which has bipartisan support in Congress. This is the first serious attempt to provide nationwide digital privacy standards in the US.
  • A lawsuit filed in California claims that Oracle is selling a detailed social graph that incorporates information about 5 billion distinct users, roughly ⅔ the population of the planet. This information was gathered almost entirely without consent.
  • Finland is planning to test digital passports later this year. Volunteers with digital passports will be issued a smartphone app, rather than papers. Digital passports will require travelers to send plans to border control agencies, and a photo of them will be taken at the border.
  • A startup is attempting to grow a new liver inside a human body, as an alternative to a transplant. They will inject the patient’s lymph nodes with cells that will hopefully be able to reproduce and function as an alternate liver.
  • Tiny caps for tiny brains: Researchers have developed “caps” that can measure activity in brain organoids (cultured clusters of human neurons). It’s possible that groups of organoids can then be connected and networked. Is this the next neural network?
  • A bioengineered cornea made from collagen collected from pig skin could be an important step in treating keratoconus and other causes of blindness. Artificial corneas would eliminate the problem of donor shortage, and can be stored for much longer than donated corneas.
  • A startup in Israel is creating artificial human embryos from human cells. These embryos, which survive for several days but are not viable, could be used to harvest very early-stage organs for transplants.
  • Materials that can think: Researchers have developed a mechanical integrated circuit that can respond to physical stresses, like touch, and perform computation on those stresses, and generate digital output.
  • Eutelsat, a European satellite operator, has launched a commercial “software defined satellite”: a satellite that can be reconfigured for different missions once it’s in space.
  • Developing robots just got easier. Quad-SDK is an open source stack for four-legged locomotion that’s compatible with ROS, the Robot Operating System.
  • Artificial intelligence isn’t just about humans. A startup is reverse-engineering insect brains to develop efficient vision and motion systems for robots.
  • A Japanese firm has developed robots that are being used to stock shelves in a convenience store chain.
  • Chicago’s Array of Things is an edge network for a smart city: an array of inexpensive temporary sensors to report on issues like traffic, safety, and air quality. Although the sensors include cameras, they only send processed data (not video) and can’t be used for surveillance.
  • The US Department of Energy is funding research on using sensors, drones, and machine learning to predict and detect wildfires. This includes identifying power line infrastructure that’s showing signs of arcing and in need of maintenance.
  • The UK is developing “flyways” for drones: Project Skyway will reserve flight paths for drone aircraft between six major cities.
Web3
  • Ethereum will be moving to proof-of-stake in September. Fred Wilson has an analysis of what this will mean for the network. The current proof-of-work blockchain will continue to exist.
  • Beginning in November, international payments will begin moving to blockchains, based on the ISO 20022 standard. A small number of cryptocurrencies comply with this standard. (Bitcoin and Ethereum are not on the list.)
  • Application-specific blockchains, or appchains, may be the way to go, rather than using a Layer 1 blockchain like Ethereum. Appchains can be built to know about each other, making it easier to develop sophisticated applications; and why let fees go to the root blockchain’s miners?
  • Cryptocurrency scams and thefts are old news these days, but now we’ve seen the first decentralized robbery. The attackers posted a “how to” on public servers, allowing others to join in the theft, and giving the original thieves cover.
Quantum Computing
  • Practical quantum computers may still be years away, but Quantum Serverless is coming. For almost all users, quantum computers will be in some provider’s cloud, and they’ll be programmed using APIs that have been designed for serverless access.
Categories: Technology

Ad Networks and Content Marketing

O'Reilly Radar - Tue, 2022/08/16 - 04:21

In a recent Radar piece, I explored N-sided marketplaces and the middlemen who bring disparate parties together. One such marketplace is the world of advertising, in which middlemen pair hopeful advertisers with consumer eyeballs. And this market for attention is absolutely huge, with global ad spend weighing in at $763 billion in 2021 revenues.

Most of that money is spent on digital ads, like the ones that follow you across websites to offer you deals on items you’ve just bought. Those are typically based on your online activity. Ad networks trail behind you as you browse the web, trying to get an idea of who you are and what you’re likely to buy, so they can pair you with hopeful merchants.

While merchants are clearly happy with targeted ads—at least, I’d hope so, given how much they’re spending—consumers have, understandably, expressed concerns over personal privacy. Apple took note, and limited iOS apps’ ability to track users across sites. Google has announced changes that would further limit advertisers’ reach. Who knows? Maybe the next step will be that the ad industry gets stronger regulations.

There’s also the question of whether targeted advertising even works.  While the ad networks aren’t required to disclose their stats, there are even people inside those companies who think that their product is “almost all crap.”

Maybe it’s time for a different approach? Recently, Disney’s video streaming service, Disney+, threw its hat into the advertising ring by announcing a new ad-supported plan. (Credit where it’s due: I originally found this in Les Echos, which may be paywalled. Here’s the official, English-language press release from Disney.)

It may be easy to disregard this Disney+ move, since so much of the online world is ad-supported these days. But I think this merits more attention than it may seem on the surface.

To be clear: I have no inside information here. But it at least looks like Disney+ can run its ad platform in a fairly low-tech fashion while also preserving privacy. That’s a pretty big deal for Disney, for consumers, and for the wider space of online advertising.

Everything old is new again

To understand why, let’s first consider the idea of “content marketing.” This is a new term for the age-old practice of selling ad space next to curated content that aligns with a particular theme. For example, let’s say you’ve created a magazine about cars. Motoring enthusiasts will read your magazine, which means advertisers (merchants) who want to reach them will place ads in your pages. The content is what draws readers and advertisers to the same spot.

What’s nice about content marketing is that the ad’s placement is based on the content, not the specific person reading it.

This addresses the privacy concern at the core of targeted advertising, because content marketing doesn’t require that you build a detailed profile of a person based on their every browsing habit. You’re not pairing an ad to a person; you’re pairing an ad to a piece of content. So you shift your analytical focus from the reader to what they’re reading.

The mouse has a large library

Now, consider Disney: its catalog spans decades’ worth of cartoons, tween sitcoms, and movies. Its recent acquisition of the Star Wars franchise gives it access to an even wider fanbase. And don’t forget that Disney owns ESPN, which adds sports content to the portfolio. It now makes that content available through its video-on-demand (VOD) platform of Disney+.

Disney already has to keep track of that catalog of content as part of its day-to-day business, which means we can reasonably assume that every show, movie, and sporting event on Disney+ has been assigned some number of descriptive tags or labels.

From the perspective of content marketing, all of this adds up to Disney+ being able to place ads on that content without having to do much extra work. The parent company, Disney, already owns the content and it’s already been tagged. The depth and breadth of the video catalog will certainly attract a large number and wide variety of viewers. That shifts the heavy lifting to the ad-matching system, which connects advertisers with the content.

Tracking your ad budget

You’ve likely heard the John Wanamaker adage: “Half the money I spend on advertising is wasted; the trouble is, I don’t know which half.” It’s a well-founded complaint about billboard or magazine advertising, since an advertiser can’t really tell how many people saw a given ad.

(Some early advertising pioneers, David Ogilvy among them, learned to supply coupons with print ads so stores could track which one had resonated the most. While this added a new level of analytical rigor to the field, it still wasn’t a perfect solution to Wanamaker’s plight.)

Delivering content-based ads through a well-curated streaming platform addresses that somewhat. Disney+ can provide an advertiser a detailed analysis of their ad spend without revealing any individual’s identity: “N number of people watched Variant V, your ad for Product P, during Show S, with the following breakdowns for time of day…”
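
To make that kind of report concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the record layout, the field names, and the `ad_report` function are my own illustration, not anything Disney+ has described. The point is only that the log carries no viewer identifier, so the report is a set of plain counts.

```python
from collections import Counter

# Hypothetical impression log: one record per ad view.
# Note that no record contains a viewer or account identifier.
impressions = [
    {"ad": "Product P / Variant V", "show": "Show S", "hour": 20},
    {"ad": "Product P / Variant V", "show": "Show S", "hour": 20},
    {"ad": "Product P / Variant V", "show": "Show S", "hour": 9},
    {"ad": "Product P / Variant W", "show": "Show T", "hour": 20},
]

def ad_report(records, ad, show):
    """Count views of one ad during one show, broken down by hour of day."""
    by_hour = Counter(
        r["hour"] for r in records if r["ad"] == ad and r["show"] == show
    )
    return {
        "ad": ad,
        "show": show,
        "views": sum(by_hour.values()),
        "by_hour": dict(by_hour),
    }

report = ad_report(impressions, "Product P / Variant V", "Show S")
print(report)
```

An advertiser reading this report learns how many times the ad ran and when, but nothing about who was watching.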

And that leads me to my next point:

Minimal ML/AI

When you review the setup—a curated and labeled catalog, with broad-brush marketing characteristics—Disney+ has the ability to run this ad service using minimal ML/AI.

(Once again: I’m speculating from the outside here. I don’t know for sure how much ML/AI Disney+ is using or plans to use. I’m working through one hypothetical-yet-seemingly-plausible scenario.)

Disney+ can use those content labels (“pro football,” “tween comedy,” “gen-X cartoon”) to pair a piece of content with an advertisement. They may not get a perfect hit rate on these ads, but because they’re building on top of work they’ve already done (the catalog and the streaming platform), the ad system can run at a relatively low cost. And providing stats to advertisers is a matter of counting. Since those calculations are so trivial, I expect the toughest part of that BI will be scaling it to Disney’s audience size.
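
As a rough illustration of how little machinery label-based pairing needs, here is a minimal Python sketch. The show titles, the ad names, and the `match_ads` function are all invented for this example, and ranking by the number of shared labels is just one simple rule I’ve assumed; I have no idea what Disney+ actually does.

```python
# Hypothetical catalog: each title carries a set of descriptive labels.
catalog = {
    "Sunday Night Game": {"pro football", "sports"},
    "Middle School Mayhem": {"tween comedy"},
    "Saturday Morning Classics": {"gen-X cartoon", "animation"},
}

# Hypothetical ad inventory: each ad lists the labels its advertiser wants.
ads = {
    "Truck ad": {"pro football", "sports"},
    "Backpack ad": {"tween comedy"},
    "Retro console ad": {"gen-X cartoon", "tween comedy"},
}

def match_ads(content_tags, ads):
    """Rank ads by how many labels they share with the content.

    Ads with no label in common are dropped; ties are broken by ad name.
    """
    scored = [(len(content_tags & wanted), name) for name, wanted in ads.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ranked if score > 0]

matches = {title: match_ads(tags, ads) for title, tags in catalog.items()}
for title, names in matches.items():
    print(title, "->", names)
```

Nothing here looks at who is watching. The only inputs are labels on the content and labels on the ads, which is why this kind of matching needs no personal profile at all.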

Can Disney+ still use ML/AI in places? They most certainly can, but they don’t have to. Disney+ has the option to run this using a smaller team of data scientists and a far smaller data analysis infrastructure. Whether you call this “smaller budget” or “higher margins,” the net effect is the same: the company ends the day with money in its pocket.

Disney+ can task that ML team with building models that better tag content, or that improve matches between content and advertisers. They don’t have to spend money analyzing the specific actions of a specific individual in the hopes of placing ads.

Future-proofing the ad system

Assuming that the Disney+ ad system will indeed run on a content marketing concept, that means the company has one more card to play: They have just sidestepped potential future privacy laws that limit the use of personal information.

Yes, Disney+ can get a person’s contact information when they subscribe to the service. Yes, the company can track customer behavior on- and off-platform, through a mix of first- and third-party data. But, contrary to targeted advertising, they don’t need all of that to run ads. All the company needs is to pair content with an advertisement. Given that this is the modern-day equivalent of a billboard or newspaper article, I imagine it would be difficult for Disney+ to run afoul of any present-day or upcoming privacy regulation with such an ad setup.

There’s still some room for trouble…

Going back to our car magazine example, Disney’s library is the equivalent of hundreds or even thousands of magazines. And if a single magazine is a hint as to a single interest, what can a larger number of magazines tell us?

By tracking what content a person watches, how they watch it (phone, tablet, TV), and what time of day, Disney+ could infer quite a bit about that person and household: the number and age of adults; marital or relationship status; age and number of children; whether this is a multi-generational household; and even some clues as to viewers’ gender. (I emphasize the term “infer” here, since it would hardly be perfect.)

In turn, Disney could use this for ad targeting, or to provide even more-detailed breakdowns to advertisers, or even find ways to share the data with other companies. This could get creepy quickly, so let’s hope they don’t take this route. And based on what we’ve covered thus far, Disney+ has every opportunity to run an ad network that preserves a reasonable amount of privacy.

Could the tail someday wag the dog?

Another possible wrinkle would be in how advertising weighs on future content.

Disney already has a good eye for what people will want to watch. And right now, those viewers are Disney’s customers. But when Disney+ becomes an ad marketplace, they’ll officially be a middleman, which means they’ll have to keep both sides of the ad equation happy. At what point does Disney use the Disney+ advertising as a compass, feeding back into decisions around what content to create?

And would Disney ever stretch beyond its own character lines, to build TV and movies around someone else’s toys? It’s not too far-fetched an idea. In The Great Beanie Baby Bubble, author Zac Bissonnette points out that:

[A TV show deal] was the kind of product-based programming that was responsible for billions per year in sales and could turn toys that no one wanted into hits through sheer exposure. Lines such as He-Man, My Little Pony, and the ThunderCats had all become hundred-million-dollar brands with the help of the product-based TV shows that accompanied their launches.

Creating content in one side of the business while running ads in the other is not unlike running an investment bank and retail bank under one roof: sure, it can lead to all kinds of interesting business opportunities. It can also lead to trouble.

When it comes to content marketing, you need to strike a balance: you want to create evergreen content, so you can continue to run ads. And when that content is going into the Disney catalog—some of which currently spans multiple generations—it has to be absolutely timeless. Giving in to the whims of a single advertiser, or a single fad, can lead to short-term gains but also short-lived content.

Beyond the Magic Kingdom

Despite those challenges, content marketing has huge potential for generating revenue, preserving privacy, and avoiding future regulation that could hinder targeted advertising. By building this system on BI and content tagging, Disney could do so at a smaller price tag than an AI-based, targeted-ad marketplace.

And this isn’t just a Disney opportunity. I’ve focused on them in this piece but other VOD providers have already seen the benefit in monetizing their catalog. According to Jason Kilar, former CEO of WarnerMedia, “Close to 50% of every new [HBO Max] subscriber is choosing the ad tier. Hulu, the last stat they shared publicly, is they are north of 60%.” Amazon will rename its ad-supported IMDb TV service to Freevee. (I first saw this in Der Spiegel; I’ve since found a US press release.) And Netflix, long a holdout in the ad-supported space, hinted at plans for a similar offering.

To be clear, content marketing at this scale is not exactly a get-rich-quick scheme. It works best for groups that already have a large amount of content—video, image, text, audio—that they can monetize. This certainly holds true for the platforms I’ve just mentioned. Maybe it’s also true for your company?

It may require getting creative as you comb through your attic. And maybe there’s an option for a new kind of ad marketplace, one that groups people with a small amount of content into a larger content ecosystem. Sort of like what EthicalAds does for developer documentation. If low-cost, non-invasive content marketing is an option, it can’t hurt to try.

Many thanks to Chris Butler for reviewing an early draft of this article. I always appreciate his insights. The section on the tail wagging the dog was based on his idea and I give him full credit for pointing this out to me.

Categories: Technology

Topics for Meeting on Aug 11th

PLUG - Thu, 2022/08/11 - 14:55
This is a remote meeting. Please join by going to at 7pm on Thursday August 11th

There are 2 topics this month. Firefox: Multi-Account Containers and Virtual Data Optimizer (VDO) - Data Reduction for Block Storage

der.hans: Firefox: Multi-Account Containers

Use Firefox Multi-Account Containers (FMAC) to isolate websites in your browser.
Use Firefox Multi-Account Containers to use multiple web site accounts in the browser.
FMAC blocks containers from accessing cookies in other containers.
Attendees will learn about:
  • Firefox add-ons
  • Firefox containers
  • browser privacy
  • browser security
  • walled garden containers

About der.hans:
der.hans works remotely in the US. In the FLOSS community he is active with conferences and local groups.

He's chairman of PLUG, chair for SeaGL Finance committee, founder of SeaGL Career Expo, RaiseMe career counselor, BoF organizer for the Southern California Linux Expo (SCaLE) and founder of the Free Software Stammtisch.

He speaks regularly at community-led FLOSS conferences such as SeaGL, Tux-Tage, Penguicon, SCaLE, LCA, FOSSASIA, Tübix, CLT, Kieler Linux Tage, LFNW, OLF, SELF and GeekBeacon Fest.

Hans is available to speak remotely for local groups.
He currently leads a support engineering team at Object Rocket. Public statements are not representative of $dayjob.
Fediverse/Mastodon -

Brian Peters: Virtual Data Optimizer (VDO) - Data Reduction for Block Storage

Introduction to Virtual Data Optimizer (VDO), an advanced storage technology for maximizing drive space. In this presentation we'll discuss use cases for VDO, advantages & disadvantages, and demo configuring & testing a drive using Virtual Data Optimizer.

About Brian:
Brian Peters has been interested in technology since childhood. His first PC was a 486 clone that was upgraded many times over. His interest in Linux started with Ubuntu 5.10 (Breezy Badger), but he has since found a home with Debian. Brian is RHCSA certified and enjoys sharing his passion for FOSS with others.

On Technique

O'Reilly Radar - Tue, 2022/08/09 - 04:12

In a previous article, I wrote about how models like DALL-E and Imagen disassociate ideas from technique. In the past, if you had a good idea in any field, you could only realize that idea if you had the craftsmanship and technique to back it up. With DALL-E, that’s no longer true. You can say, “Make me a picture of a lion attacking a horse,” and it will happily generate one. Maybe not as good as the one that hangs in an art museum, but you don’t need to know anything about canvas, paints, and brushes, nor do you need to get your clothes covered with paint.

This raises some important questions, though. What is the connection between expertise and ideation? Does technique help you form ideas? (The Victorian artist William Morris is often quoted as saying “You can’t have art without resistance in the materials,” though he may only have been talking about his hatred of typewriters.) And what kinds of user interfaces will be effective for collaborations between humans and computers, where the computers supply the technique and we supply the ideas? Designing the prompts to get DALL-E to do something extraordinary requires a new kind of technique that’s very different from understanding pigments and brushes. What kinds of creativity does that new technique enable? How are these works different from what came before?

As interesting as it is to talk about art, there’s an area where these questions are more immediate. GitHub Copilot (based on a model named Codex, which is derived from GPT-3) generates code in a number of programming languages, based on comments that the user writes. Going in the other direction, GPT-3 has proven to be surprisingly good at explaining code. Copilot users still need to be programmers; they need to know whether the code that Copilot supplies is correct, and they need to know how to test it. The prompts themselves are really a sort of pseudo-code; even if the programmers don’t need to remember details of the language’s syntax or the names of library functions, they still need to think like programmers. But it’s obvious where this is trending. We need to ask ourselves how much “technique” we will ask of future programmers: in the 2030s or 2040s, will people just be able to tell some future Copilot what they want a program to be? More to the point, what sort of higher-order knowledge will future programmers need? Will they be able to focus more on the nature of what they want to accomplish, and less on the syntactic details of writing code?

It’s easy to imagine a lot of software professionals saying, “Of course you’ll have to know C. Or Java. Or Python. Or Scala.” But I don’t know if that’s true. We’ve been here before. In the 1950s, computers were programmed in machine language. (And before that, with cables and plugs.) It’s hard to imagine now, but the introduction of the first programming languages–Fortran, COBOL, and the like–was met with resistance from programmers who thought you needed to understand the machine. Now almost no one works in machine language or assembler. Machine language is reserved for a few people who need to work on some specialized areas of operating system internals, or who need to write some kinds of embedded systems code.

What would be necessary for another transformation? Tools like Copilot, useful as they may be, are nowhere near ready to take over. What capabilities will they need? At this point, programmers still have to decide whether or not code generated by Copilot is correct. We don’t (generally) have to decide whether the output of a C or Java compiler is correct, nor do we have to worry about whether, given the same source code, the compiler will generate identical output. Copilot doesn’t make that guarantee–and, even if it did, any change to the model (for example, to incorporate new StackOverflow questions or GitHub repositories) would be very likely to change its output. While we can certainly imagine compiling a program from a series of Copilot prompts, I can’t imagine accepting a program that would be likely to stop working if it was recompiled without changes to the source code. Perhaps the only exception would be a library that could be developed once, then tested, verified, and used without modification–but the development process would have to re-start from ground zero whenever a bug or a security vulnerability was found. That wouldn’t be acceptable; we’ve never written programs that don’t have bugs, or that never need new features. A key principle behind much modern software development is minimizing the amount of code that has to change to fix bugs or add features.
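The contrast between a compiler's reproducibility and a sampled model's variability can be made concrete with a toy sketch. Nothing below is how Copilot or any real compiler works; `toy_compile`, `toy_generate`, and the candidate snippets are hypothetical stand-ins. The only point is that one function depends purely on its input, while the other also depends on random state.

```python
import hashlib
import random

# Hypothetical completions a model might choose among for a single prompt.
CANDIDATES = [
    "return sorted(items)",
    "items.sort(); return items",
    "return list(sorted(items, key=None))",
]

def toy_compile(source: str) -> str:
    """Deterministic stand-in for a compiler: output depends only on source."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()

def toy_generate(prompt: str, rng: random.Random) -> str:
    """Stochastic stand-in for a code model.

    The prompt is ignored in this toy; the result depends on the rng state,
    which is exactly the property that makes output unrepeatable.
    """
    return rng.choice(CANDIDATES)

def distinct_outputs(prompt: str, runs: int, seed: int) -> set:
    """Collect the distinct results of calling toy_generate `runs` times."""
    rng = random.Random(seed)
    return {toy_generate(prompt, rng) for _ in range(runs)}

if __name__ == "__main__":
    same = toy_compile("int main(){}") == toy_compile("int main(){}")
    print("compiler output repeatable:", same)
    print("distinct generated outputs:", len(distinct_outputs("sort a list", 50, seed=1)))
```

Pinning the seed makes the toy generator repeatable, but that kind of guarantee would not survive a change to the underlying model, which is the harder problem described above.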

It’s easy to think that programming is all about creating new code. It isn’t; one thing that every professional learns quickly is that most of the work goes into maintaining old code. A new generation of programming tools must take that into account, or we’ll be left in a weird situation where a tool like Copilot can be used to write new code, but programmers will still have to understand that code in detail because it can only be maintained by hand. (It is possible–even likely–that we will have AI-based tools that help programmers research software supply chains, discover vulnerabilities, and possibly even suggest fixes.) Writing about AI-generated art, Raphaël Millière says, “No prompt will produce the exact same result twice”; that may be desirable for artwork, but is destructive for programming. Stability and consistency are requirements for next-generation programming tools; we can’t take a step backwards.

The need for greater stability might drive tools like Copilot from free-form English language prompts to some kind of more formal language. A book about prompt engineering for DALL-E already exists; in a way, that’s trying to reverse-engineer a formal language for generating images. A formal language for prompts is a move back in the direction of traditional programming, though possibly with a difference. Current programming languages are all about describing, step by step, what you want the computer to do in great detail. Over the years, we’ve gradually progressed to higher levels of abstraction. Could building a language model into a compiler facilitate the creation of a simpler language, one in which programmers just described what they wanted to do, and let the machine worry about the implementation, while providing guarantees of stability? Remember that it was possible to build applications with graphical interfaces, and for those applications to communicate over the Internet, before the Web. The Web (and, specifically, HTML) added a new formal language that encapsulated tasks that used to require programming.

Now let’s move up a level or two: from lines of code to functions, modules, libraries, and systems. Everyone I know who has worked with Copilot has said that, while you don’t need to remember the details of the programming libraries you’re using, you have to be even more aware of what you’re trying to accomplish. You have to know what you want to do; you have to have a design in mind. Copilot is good at low-level coding; does a programmer need to be in touch with the craft of low-level coding to think about the high-level design? Up until now that’s certainly been true, but largely out of necessity: you wouldn’t let someone design a large system who hasn’t built smaller systems. It is true (as Dave Thomas and Andy Hunt argued in The Pragmatic Programmer) that knowing different programming languages gives you different tools and approaches for solving problems.  Is the craft of software architecture different from the craft of programming?

We don’t really have a good language for describing software design. Attempts like UML have been partially successful at best. UML was both over- and under-specified, too precise and not precise enough; tools that generated source code scaffolding from UML diagrams exist, but aren’t commonly used these days. The scaffolding defined interfaces, classes, and methods that could then be implemented by programmers. While automatically generating the structure of a system sounds like a good idea, in practice it may have made things more difficult: if the high-level specification changed, so did the scaffolding, obsoleting any work that had been put into implementing with the scaffold. This is similar to the compiler’s stability problem, modulated into a different key. Is this an area where AI could help?

I suspect we still don’t want source code scaffolding, at least as UML envisioned it; that’s bound to change with any significant change in the system’s description. Stability will continue to be a problem. But it might be valuable to have an AI-based design tool that can take a verbal description of a system’s requirements, then generate some kind of design based on a large library of software systems–like Copilot, but at a higher level. Then the problem would be integrating that design with implementations of the design, some of which could be created (or at least suggested) by a system like Copilot. The problem we’re facing is that software development takes place on two levels: high-level design and mid-level programming. Integrating the two is a hard problem that hasn’t been solved convincingly. Can we imagine taking a high-level design, adding our descriptions to it, and going directly from the high-level design with mid-level details to an executable program? That programming environment would need the ability to partition a large project into smaller pieces, so teams of programmers could collaborate. It would need to allow changes to the high-level descriptions, without disrupting work on the objects and methods that implement those descriptions. It would need to be integrated with a version control system that is as effective for the English-language descriptions as it is for lines of code. This wouldn’t be thinkable without guarantees of stability.

It was fashionable for a while to talk about programming as “craft.” I think that fashion has waned, probably for the better; “code as craft” has always seemed a bit precious to me. But the idea of “craft” is still useful: it is important for us to think about how the craft may change, and how fundamental those changes can be. It’s clear that we are a long way from a world where only a few specialists need to know languages like C or Java or Python. But it’s also possible that developments like Copilot give us a glimpse of what the next step might be. Lamenting the state of programming tools, which haven’t changed much since the 1960s, Alan Kay wrote on Quora that “the next significant threshold that programming must achieve is for programs and programming systems to have a much deeper understanding of both what they are trying to do, and what they are actually doing.” A new craft of programming that is focused less on syntactic details, and more on understanding what the systems we are building are trying to accomplish, is the goal we should be aiming for.

Categories: Technology

Scaling False Peaks

O'Reilly Radar - Thu, 2022/08/04 - 04:12

Humans are notoriously poor at judging distances. There’s a tendency to underestimate, whether it’s the distance along a straight road with a clear run to the horizon or the distance across a valley. When ascending toward a summit, estimation is further confounded by false summits. What you thought was your goal and end point turns out to be a lower peak or simply a contour that, from lower down, looked like a peak. You thought you made it–or were at least close–but there’s still a long way to go.

The story of AI is a story of punctuated progress, but it is also the story of (many) false summits.

In the 1950s, machine translation of Russian into English was considered to be no more complex than dictionary lookups and templated phrases. Natural language processing has come a very long way since then, having burnt through a good few paradigms to get to something we can use on a daily basis. In the 1960s, Marvin Minsky and Seymour Papert proposed the Summer Vision Project for undergraduates: connect a TV camera to a computer and identify objects in the field of view. Computer vision is now something that is commodified for specific tasks, but it continues to be a work in progress and, worldwide, has taken more than a few summers (and AI winters) and many more than a few undergrads.

We can find many more examples across many more decades that reflect naiveté and optimism and–if we are honest–no small amount of ignorance and hubris. The two general lessons to be learned here are not that machine translation involves more than lookups and that computer vision involves more than edge detection, but that when we are confronted by complex problems in unfamiliar domains, we should be cautious of anything that looks simple at first sight, and that when we have successful solutions to a specific sliver of a complex domain, we should not assume those solutions are generalizable. This kind of humility is likely to deliver more meaningful progress and a more measured understanding of such progress. It is also likely to reduce the number of pundits in the future who mock past predictions and ambitions, along with the recurring irony of machine-learning experts who seem unable to learn from the past trends in their own field.

All of which brings us to DeepMind’s Gato and the claim that the summit of artificial general intelligence (AGI) is within reach. The hard work has been done and reaching AGI is now a simple matter of scaling. At best, this is a false summit on the right path; at worst, it’s a local maximum far from AGI, which lies along a very different route in a different range of architectures and thinking.

DeepMind’s Gato is an AI model that can be taught to carry out many different kinds of tasks based on a single transformer neural network. The 604 tasks Gato was trained on vary from playing Atari video games to chat, from navigating simulated 3D environments to following instructions, from captioning images to real-time, real-world robotics. The achievement of note is that it’s underpinned by a single model trained across all tasks rather than different models for different tasks and modalities. Learning how to ace Space Invaders does not interfere with or displace the ability to carry out a chat conversation.

Gato was intended to “test the hypothesis that training an agent which is generally capable on a large number of tasks is possible; and that this general agent can be adapted with little extra data to succeed at an even larger number of tasks.” In this, it succeeded. But how far can this success be generalized in terms of loftier ambitions? The tweet that provoked a wave of responses (this one included) came from DeepMind’s research director, Nando de Freitas: “It’s all about scale now! The game is over!”

The game in question is the quest for AGI, which is closer to what science fiction and the general public think of as AI than the narrower but applied, task-oriented, statistical approaches that constitute commercial machine learning (ML) in practice.

The claim is that AGI is now simply a matter of improving performance, both in hardware and software, and making models bigger, using more data and more kinds of data across more modes. Sure, there’s research work to be done, but now it’s all about turning the dials up to 11 and beyond and, voilà, we’ll have scaled the north face of the AGI to plant a flag on the summit.

It’s easy to get breathless at altitude.

When we look at other systems and scales, it’s easy to be drawn to superficial similarities in the small and project them into the large. For example, if we look at water swirling down a plughole and then out into the cosmos at spiral galaxies, we see a similar structure. But these spirals are more closely bound in our desire to see connection than they are in physics. In looking at scaling specific AI to AGI, it’s easy to focus on tasks as the basic unit of intelligence and ability. What we know of intelligence and learning systems in nature, however, suggests the relationships between tasks, intelligence, systems, and adaptation are more complex and more subtle. Simply scaling up one dimension of ability may simply scale up one dimension of ability without triggering emergent generalization.

If we look closely at software, society, physics or life, we see that scaling is usually accompanied by fundamental shifts in organizing principle and process. Each scaling of an existing approach is successful up to a point, beyond which a different approach is needed. You can run a small business using office tools, such as spreadsheets, and a social media page. Reaching Amazon-scale is not a matter of bigger spreadsheets and more pages. Large systems have radically different architectures and properties to either the smaller systems they are built from or the simpler systems that came before them.

It may be that artificial general intelligence is a far more significant challenge than taking task-based models and increasing data, speed, and number of tasks. We typically underappreciate how complex such systems are. We divide and simplify, make progress as a result, only to discover, as we push on, that the simplification was just that; a new model, paradigm, architecture, or schedule is needed to make further progress. Rinse and repeat. Put another way, just because you got to basecamp, what makes you think you can make the summit using the same approach? And what if you can’t see the summit? If you don’t know what you’re aiming for, it’s difficult to plot a course to it.

Instead of assuming the answer, we need to ask: How do we define AGI? Is AGI simply task-based AI for N tasks and a sufficiently large value of N? And, even if the answer to that question is yes, is the path to AGI necessarily task-centric? How much of AGI is performance? How much of AGI is big/bigger/biggest data?

When we look at life and existing learning systems, we learn that scale matters, but not in the sense suggested by a simple multiplier. It may well be that the trick to cracking AGI is to be found in scaling–but down rather than up.

Doing more with less looks to be more important than doing more with more. For example, the GPT-3 language model is based on a network of 175 billion parameters. The first version of DALL-E, the prompt-based image generator, used a 12-billion parameter version of GPT-3; the second, improved version used only 3.5 billion parameters. And then there’s Gato, which achieves its multitask, multimodal abilities with only 1.2 billion.

These reductions hint at the direction, but it’s not clear that Gato’s, GPT-3’s or any other contemporary architecture is necessarily the right vehicle to reach the destination. For example, how many training examples does it take to learn something? For biological systems, the answer is, in general, not many; for machine learning, the answer is, in general, very many. GPT-3, for example, developed its language model based on 45TB of text. Over a lifetime, a human reads and hears of the order of a billion words; a child is exposed to ten million or so before starting to talk. Mosquitoes can learn to avoid a particular pesticide after a single non-lethal exposure. When you learn a new game–whether video, sport, board or card–you generally only need to be told the rules and then play, perhaps with a game or two for practice and rule clarification, to make a reasonable go of it. Mastery, of course, takes far more practice and dedication, but general intelligence is not about mastery.

And when we look at the hardware and its needs, consider that while the brain is one of the most power-hungry organs of the human body, it still has a modest power consumption of around 12 watts. Over a life the brain will consume up to 10 MWh; training the GPT-3 language model took an estimated 1 GWh.
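As a rough check on those figures, the arithmetic can be sketched in a few lines. The 80-year lifespan is an assumption made here for illustration; the 12 watt and 1 GWh figures are the ones quoted above.

```python
HOURS_PER_YEAR = 24 * 365.25

def lifetime_energy_mwh(power_watts: float, years: float) -> float:
    """Energy used by a constant load over a number of years, in MWh."""
    watt_hours = power_watts * HOURS_PER_YEAR * years
    return watt_hours / 1_000_000  # 1 MWh = 1,000,000 Wh

BRAIN_WATTS = 12           # figure quoted in the text
ASSUMED_LIFESPAN = 80      # years; an assumption, not from the text
GPT3_TRAINING_MWH = 1000   # 1 GWh, the estimate quoted in the text

brain_mwh = lifetime_energy_mwh(BRAIN_WATTS, ASSUMED_LIFESPAN)
ratio = GPT3_TRAINING_MWH / brain_mwh

print(f"Brain over {ASSUMED_LIFESPAN} years: {brain_mwh:.1f} MWh")
print(f"GPT-3 training is roughly {ratio:.0f} lifetimes of brain energy")
```

Under that assumption the brain uses about 8.4 MWh over a lifetime, consistent with "up to 10 MWh," so the training estimate corresponds to something over a hundred lifetimes of brain energy.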

When we talk about scaling, the game is only just beginning.

While hardware and data matter, the architectures and processes that support general intelligence may be necessarily quite different to the architectures and processes that underpin current ML systems. Throwing faster hardware and all the world’s data at the problem is likely to see diminishing returns, although that may well let us scale a false summit from which we can see the real one.

Categories: Technology

The Metaverse Is Not a Place

O'Reilly Radar - Tue, 2022/08/02 - 11:38

The metaphors we use to describe new technology constrain how we think about it, and, like an out-of-date map, often lead us astray. So it is with the metaverse. Some people seem to think of it as a kind of real estate, complete with land grabs and the attempt to bring traffic to whatever bit of virtual property they’ve created.

Seen through the lens of the real estate metaphor, the metaverse becomes a natural successor not just to Second Life but to the World Wide Web and to social media feeds, which can be thought of as a set of places (sites) to visit. Virtual Reality headsets will make these places more immersive, we imagine.

But what if, instead of thinking of the metaverse as a set of interconnected virtual places, we think of it as a communications medium? Using this metaphor, we see the metaverse as a continuation of a line that passes through messaging and email to “rendezvous”-type social apps like Zoom, Google Meet, Microsoft Teams, and, for wide broadcast, Twitch + Discord. This is a progression from text to images to video, and from store-and-forward networks to real time (and, for broadcast, “stored time,” which is a useful way of thinking about recorded video), but in each case, the interactions are not place based but happening in the ether between two or more connected people. The occasion is more the point than the place.

In an interview with Lex Fridman, Mark Zuckerberg disclaimed the notion of the metaverse as a place, but in the same sentence described its future in a very place-based way:

A lot of people think that the Metaverse is about a place, but one definition of this is it’s about a time when basically immersive digital worlds become the primary way that we live our lives and spend our time.

Think how much more plausible this statement might be if it read:

A lot of people think that the Metaverse is about a place, but one definition of this is it’s about a time when immersive digital worlds become the primary way that we communicate and share digital experiences.

My personal metaverse prototype moment does not involve VR at all, but Zoom. My wife Jen and I join our friend Sabrina over Zoom each weekday morning to exercise together. Sabrina leads the sessions by sharing her Peloton app, which includes live and recorded exercise videos. Our favorites are the strength training videos with Rad Lopez and the 15-minute abs videos with Robin Arzón. We usually start with Rad and end with Robin, for a vigorous 45-minute workout.

Think about this for a moment: Jen and I are in our home. Sabrina is in hers. Rad and Robin recorded their video tracks from their studios on the other side of the country. Jen and Sabrina and I are there in real time. Rad and Robin are there in stored time. We have joined five people in four different places and three different times into one connected moment and one connected place, “the place between” the participants.

Sabrina also works out on her own on her Peloton bike, and that too has this shared quality, with multiple participants at various “thicknesses” of connection. While Jen and Sabrina and I are “enhancing” the sharing using real-time Zoom video, Sabrina’s “solo” bike workouts use the intrinsic sharing in the Peloton app, which lets participants see real-time stats from others doing the same ride.

This is the true internet—the network of networks, with dynamic interconnections. If the metaverse is to inherit that mantle, it has to have that same quality. Connection.

Hacker News user kibwen put it beautifully when they wrote:

A metaverse involves some kind of shared space and shared experience across a networked medium. Not only is it more than just doing things in VR, a metaverse doesn’t even require VR.

The metaverse as a vector

It’s useful to look at technology trends (lines of technology progression toward the future, and inheritance from the past) as vectors—quantities that can only be fully described by both a magnitude and a direction and that can be summed or multiplied to get a sense of how they might cancel, amplify, or redirect possible pathways to the future.
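The vector metaphor can be made literal with a small sketch. The magnitudes and directions below are made-up illustrative numbers, not measurements of any real trend, and `trend`, `combine`, `strength`, and `heading` are hypothetical helper names; the sketch covers only the summing case.

```python
import math

def trend(magnitude: float, degrees: float) -> tuple:
    """Build a 2D vector from a magnitude and a direction in degrees."""
    rad = math.radians(degrees)
    return (magnitude * math.cos(rad), magnitude * math.sin(rad))

def combine(*trends: tuple) -> tuple:
    """Sum vectors component by component."""
    return (sum(t[0] for t in trends), sum(t[1] for t in trends))

def strength(v: tuple) -> float:
    """Magnitude of the combined vector."""
    return math.hypot(v[0], v[1])

def heading(v: tuple) -> float:
    """Direction of the combined vector, in degrees."""
    return math.degrees(math.atan2(v[1], v[0]))

# Two trends pointing the same way amplify each other.
aligned = combine(trend(1, 0), trend(1, 0))
# Two equal trends pointing in opposite directions cancel.
opposed = combine(trend(1, 0), trend(1, 180))
# Two trends at right angles redirect the overall path.
crossing = combine(trend(1, 0), trend(1, 90))

if __name__ == "__main__":
    print("aligned strength:", round(strength(aligned), 3))
    print("opposed strength:", round(strength(opposed), 3))
    print("crossing heading:", round(heading(crossing), 1))
```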

I wrote about this idea back in 2020, in a piece called “Welcome to the 21st Century,” in the context of using scenario planning to imagine the post-COVID future. It’s worth recapping here:

Once you’ve let loose your imagination, observe the world around you and watch for what scenario planners sometimes call “news from the future”—data points that tell you that the world is trending in the direction of one or another of your imagined scenarios. As with any scatter plot, data points are all over the map, but when you gather enough of them, you can start to see the trend line emerge.…

If you think of trends as vectors, new data points can be seen as extending and thickening the trend lines and showing whether they are accelerating or decelerating. And as you see how trend lines affect each other, or that new ones need to be added, you can continually update your scenarios (or as those familiar with Bayesian statistics might put it, you can revise your priors). This can be a relatively unconscious process. Once you’ve built mental models of the world as it might be, the news that you read will slot into place and either reinforce or dismantle your imagined future.

Here’s how my thinking about the metaverse was formed by “news from the future” accreting around a technology-development vector:

  1. I had a prior belief, going back decades, that the internet is a tool for connection and communication, and that advances along that vector will be important. I’m always looking with soft focus for evidence that the tools for connection and communication are getting richer, trying to understand how they are getting richer and how they are changing society. 
  2. I’ve been looking at VR for years, trying various headsets and experiences, but they are mostly solo and feel more like stand-alone games or, if shared, awkward and cartoonish. Then I read a thoughtful piece by my friend Craig Mod in which he noted that while he lives his physical life in a small town in Japan or walking its ancient footpaths, he also has a work life in which he spends time daily with people all over the world. I believe he made the explicit connection to the metaverse, but neither he nor I can find the piece that planted this thought to confirm that. In any case, I think of Craig’s newsletter as where the notion that the metaverse is a continuation of the communications technologies of the internet took hold for me.
  3. I began to see the connection to Zoom when friends started using interesting backgrounds, some of which make them appear other than where they are and others that make clear just where they are. (For example, my friend Hermann uses as a background the beach behind his home in New Zealand, which is more vividly place based than his home office, which could be anywhere.) That then brought my exercise sessions with Sabrina and Jen into focus as part of this evolving story.
  4. I talked to Phil Libin about his brilliant service mmhmm, which makes it easy to create and deliver richer, more interactive presentations over Zoom and similar apps. The speaker literally gets to occupy the space of the presentation. Phil’s presentation on “The Out of Office World” was where it all clicked. He talks about the hierarchy of communication and the tools for modulating it. (IMO this is a must-watch piece for anyone thinking about the future of internet apps. I’m surprised how few people seem to have watched it.)
  5. Trying Supernatural using the Meta Quest 2 headset completed the connection between my experience using Zoom and Peloton for fitness with friends and the VR-dominant framing of the metaverse. Here I was, standing on the edge of one of the lava lakes at Erta Ale in Ethiopia, an astonishing volcano right out of central casting for Mount Doom in The Lord of the Rings, working through warm-up exercises with a video of a fitness instructor green-screened into the scene, before launching into a boxing training game. Coach Susie was present in stored time, just like Robin and Rad. All that was missing was Jen and Sabrina. I’m sure that such shared experiences in remarkable places are very much part of the VR future.

That kind of shared experience is central to Mark Zuckerberg’s vision of socializing in the metaverse.

In that video, Zuck shows off lavishly decorated personal spaces, photorealistic and cartoon avatars, and an online meeting interrupted by a live video call. He says:

It’s a ways off but you can start to see some of the fundamental building blocks take shape. First the feeling of presence. This is the defining quality of the metaverse. You’re going to really feel like you’re there with other people. You’ll see their facial expressions, you’ll see their body language, maybe figure out if they’re actually holding a winning hand—all the subtle ways that we communicate that today’s technology can’t quite deliver.

I totally buy the idea that presence is central. But Meta’s vision seems to miss the mark in its focus on avatars. Embedded video delivers more of that feeling of presence with far less effort on the part of the user than learning to create avatars that mimic our gestures and expressions.

Chris Milk, the CEO of Within, the company that created Supernatural, both agreed and disagreed about avatars when explaining the company’s origin story to me in a phone conversation a few months ago:

What we learned early on was that photorealism matters a lot in terms of establishing presence and human connection. Humans, captured using photorealistic methods like immersive video, allow for a deeper connection between the audience and the people recorded in the immersive VR experience. The audience feels present in the story with them. But it’s super hard to do from a technical standpoint and you give up a bunch of other things. The trade-off is that you can have photorealism but sacrifice interactivity, as the photorealistic humans need to be prerecorded. Alternatively, you can have lots of interactivity and human-to-human communication, but you give up on anyone looking real. In the latter, the humans need to be real-time-rendered avatars, and those, for the moment, don’t look remotely like real humans.

At the same time, Milk pointed out that humans are able to read a lot into even crude avatars, especially when they’re accompanied by real-time communication using voice.

Especially if it’s someone you already know, then the human connection can overcome a lot of missing visual realism. We did an experiment back in 2014 or 2015, probably. Aaron [Koblin, the cofounder of Within] was living in San Francisco, and I was in Los Angeles. We had built a VR prototype where we each had a block for the head and two blocks for our hands. I got into my headset in LA, and Aaron’s blocks were sitting over on the floor across from me as his headset and hand controllers were sitting on his floor in San Francisco. All of a sudden the three blocks jumped up off the ground into the air as he picked up his headset and put it on. The levitating cubes “walked” up to me, waved, and said, “Hey.” Immediately, before I even heard the voice, I recognized the person in those blocks as Aaron. I recognized through the posture and gait the spirit of Aaron in these three cubes moving through space. The resolution, or any shred of photorealism, was completely absent, but the humanity still showed through. And when his voice came out of them, my brain just totally accepted that the soul of Aaron now resides in these three floating cubes. Nothing was awkward about communicating back and forth. My brain just accepted it instantly.

And that’s where we get back to vectors. Understanding the future of photorealism in the metaverse depends on the speed and direction of progress in AI. In many ways, a photorealistic avatar is a kind of deepfake, and we know how computationally expensive their creation is today. How long will it be before the creation of deepfakes is cheap enough and fast enough that hundreds of millions of people can be creating and using them in real time? I suspect it will be a while.

Mmhmm’s blending of video and virtual works really well, using today’s technology. It’s ironic that in Meta’s video about the future, video is only shown on a screen in the virtual space rather than as an integral part of it. Meta could learn a lot from mmhmm.

On the other hand, creating a vast library of immersive 3D still images of amazing places into which either avatars or green-screened video images can be inserted seems much closer to realization. It’s still hard, but the problem is orders of magnitude smaller. The virtual spaces offered by Supernatural and other VR developers give an amazing taste of what’s possible here.

In this regard, an interesting sidenote came from a virtual session that we held earlier this year at the Social Science Foo Camp (an event put together annually by O’Reilly, Meta, and Sage) using the ENGAGE virtual media conferencing app. The group began their discussion in one of the default meeting spaces, but one of the attendees, Adam Flaherty, proposed that they have it in a more appropriate place. They moved to a beautifully rendered version of Oxford’s Bodleian Library, and attendees reported that the entire tenor of the conversation changed.

Two other areas worth thinking about:

  1. Social media evolved from a platform for real-time interaction (real-time status updates, forums, conversations, and groups) to one that’s often dominated by stored-time interaction (posts, stories, reels, et al). Innovation in formats for stored-time communications is at the heart of future social media competition, as TikTok has so forcefully reminded Facebook. There’s a real opportunity for developers and influencers to pioneer new formats as the metaverse unfolds.
  2. Bots are likely to play a big role in the metaverse, just as they do in today’s gaming environments. Will we be able to distinguish bots from humans? Chris Hecker’s indie game SpyParty, prototyped in 2009, made this a central feature of its game play, requiring two human players (one spy and one sniper) to find or evade each other among a party crowded with bots (what game developers call non-player characters or NPCs). Bots and deepfakes are already transforming our social experiences on the internet; expect this to happen on steroids in the metaverse. Some bots will be helpful, but others will be malevolent and disruptive. We will need to tell the difference.
The need for interoperability

There’s one thing that a focus on communications as the heart of the metaverse story reminds us: communication, above all, depends on interoperability. A balkanized metaverse in which a few big providers engage in a winner-takes-all competition to create the Meta- or Apple- or whatever-owned metaverse will take far longer to develop than one that allows developers to create great environments and experiences and connect them bit by bit with the innovations of others. It would be far better if the metaverse were an extension of the internet (“the network of networks”) rather than an attempt to replace it with a walled garden.

Some things that it would be great to have be interoperable:

  • Identity. We should be able to use the digital assets that represent who we are across platforms, apps, and places offered by different companies.
  • Sensors. Smartwatches, rings, and so forth are increasingly being used to collect physiological signals. This technology can be built into VR-specific headsets, but we would do better if it were easily shared between devices from different providers.
  • Places. (Yes, places are part of this after all.) Rather than having a single provider (say Meta) become the ur-repository of photorealistic 360-degree immersive spaces, it would be great to have an interoperability layer that allows their reuse.
  • Bot identification. Might NFTs end up becoming the basis for a nonrepudiable form of identity that must be produced by both humans and bots? (I suspect we can only force bots to identify themselves as such if we also require humans to do so.)
Foundations of the metaverse

You can continue this exercise by thinking about the metaverse as the combination of multiple technology trend vectors progressing at different speeds and coming from different directions, and pushing the overall vector forward (or backward) accordingly. No new technology is the product of a single vector.

So rather than settling on just “the metaverse is a communications medium,” think about the various technology vectors besides real-time communications that are coming together in the current moment. What news from the future might we be looking for?

  • Virtual Reality/Augmented Reality. Lighter and less obtrusive headsets. Advances in 3D video recording. Advances in sensors, including eye-tracking, expression recognition, physiological monitoring, even brain-control interfaces. Entrepreneurial innovations in the balance between AR and VR. (Why do we think of them as mutually exclusive rather than on a continuum?)
  • Social media. Innovations in connections between influencers and fans. How does stored time become more real time?
  • Gaming. Richer integration between games and communications. What’s the next Twitch + Discord?
  • AI. Not just deepfakes but the proliferation of AIs and bots as participants in social media and other communications. NPCs becoming a routine part of our online experience outside of gaming. Standards for identification of bots versus humans in online communities.
  • Cryptocurrencies and “Web3.” Does crypto/Web3 provide new business models for the metaverse? (BTW, I enjoyed the way that Neal Stephenson, in Reamde, had his character design the business model and money flows for his online game before he designed anything else. Many startups just try to get users and assume the business model will follow, but that has led us down the dead end of advertising and surveillance capitalism.)
  • Identity. Most of today’s identity systems are centralized in one way or another, with identity supplied by a trusted provider or verifier. Web3 proponents, however, are exploring a variety of systems for decentralized “self-sovereign identity,” including Vitalik Buterin’s “soulbound tokens.” The vulnerability of crypto systems to Sybil attacks in the absence of verifiable identity is driving a lot of innovation in the identity space. Molly White’s skeptical survey of these various initiatives is a great overview of the problem and the difficulties in overcoming it. Gordon Brander’s “Soulbinding Like A State,” a riff on Molly White’s post and James C. Scott’s Seeing Like A State, provides a further warning: “Scott’s framework reveals…that the dangers of legibility are not related to the sovereignty of an ID. There are many reasons self-sovereignty is valuable, but the function of a self-sovereign identity is still to make the bearer legible. What’s measured gets managed. What’s legible gets controlled.” As is often the case, no perfect solution will be found, but society will adopt an imperfect solution by making trade-offs that are odious to some, very profitable to others, and that the great mass of users will passively accept.

There’s a lot more we ought to be watching. I’d love your thoughts in the comments.

Categories: Technology

Radar Trends to Watch: August 2022

O'Reilly Radar - Tue, 2022/08/02 - 04:18

The large model train keeps rolling on. This month, we’ve seen the release of Bloom, an open, large language model developed by the BigScience collaboration, the first public access to DALL-E (along with a guide to prompt engineering), a Copilot-like model for generating regular expressions from English-language prompts, and Simon Willison’s experiments using GPT-3 to explain JavaScript code.

On other fronts, NIST has released the first proposed standard for post-quantum cryptography (i.e., cryptography that can’t be broken by quantum computers). CRISPR has been used in human trials to re-engineer a patient’s DNA to reduce cholesterol. And a surprising number of cities are paying high tech remote workers to move there.

Artificial Intelligence
  • Regardless of where a company is based, to avoid legal problems later, it’s a good idea to build AI and other data-based systems that observe the EU’s data laws.
  • Public (beta) access to DALL-E is beginning! It might take a while to get in because there are over a million on the waitlist. Accepted users get 50 free credits the first month, 15/month thereafter; a credit allows you to give one prompt, which returns 4 images. Users can buy additional credits.
  • Researchers have used reinforcement learning to build a robotic dog that learns to walk on its own in the real world (i.e., without prior training and use of a simulator).
  • Princeton held a workshop on the reproducibility crisis that the use of machine learning is causing in science. Evaluating the accuracy of results from machine learning is a problem that most scientific disciplines aren’t yet equipped to deal with.
  • Microsoft has revised its Responsible AI standard, making recommendations more concrete, particularly in the areas of accountability, transparency, fairness, safety, privacy, and inclusiveness. Microsoft also provides tools and resources to help developers build responsible AI systems.
  • The Dallery Gallery has published a Prompt Engineering Guide to DALL-E. (DALL-E is maintaining a waitlist for free trial accounts.)
  • Simon Willison has successfully used GPT-3 to explain how code works. It is amazingly good and, as Simon pointed out, works both on code that he understands, and code that he doesn’t.
  • Bloom, the open and transparent large language model developed by the BigScience group, is finished!  You can try it out, download it, and read its specifications. Unlike all other large language models, Bloom was developed in public, and is open to the public.
  • Radiologists outperform AI systems operating by themselves at detecting breast cancer from mammograms. However, a system designed to collaborate with radiologists in making decisions is better than either radiologists or AI alone. (The big question is whether these results hold up when taken to other hospitals.)
  • You liked Copilot? Try Autoregex: GPT-3 to generate regular expressions from natural language descriptions.
  • No Language Left Behind (NLLB) is a Meta AI project that translates text directly between any pair of over 200 languages. Benchmarks, training code, and models are all open source.
  • Democratic AI is an experiment in human-in-the-loop design that enables an AI system to design a social mechanism with human collaboration.
  • The Allen Institute, Microsoft, and others have developed a tool to measure the energy use and emissions generated by training AI models on Azure. They have found that emissions can be reduced substantially by training during periods when renewable power is at its peak.
  • Minerva is a large language model that Google has trained to solve quantitative reasoning (i.e., mathematics) problems, generating simple proofs in addition to answers. The problem domain extends through pre-calculus, including algebra and geometry, roughly at a high school level. Minerva has also been trained and tested in chemistry and physics.
Security
  • Perhaps the scariest exploit in security would be a rootkit that cannot be detected or removed, even by wiping the disk and reinstalling the operating system. Such rootkits were recently discovered (one is named CosmicStrand); they have apparently been in the wild since 2016.
  • AWS is offering some customers a free multi-factor authentication (MFA) security key.
  • Lost passwords are an important attack vector for industrial systems. A system is installed; the default password is changed; the person who changed the password leaves; the password is lost; the company installs password recovery software, which is often malware-infested, to recover the password.
  • A new technique for browser de-anonymization is based on correlating users’ activities on different websites.
  • Ransomware companies are now using search engines to allow their users to search the data they have stolen.
  • Ransomware doesn’t get as much attention in the news as it did last year, but in the past week one ransomware operation has shut down and released its decryptors, and two new ones (RedAlert and omega) have started.
  • Apple has added “lockdown mode” to iOS.  Lockdown mode provides an extreme degree of privacy; it is intended for people who believe they are being targeted by state-sponsored mercenary spyware.
  • The Open Source Security Mobilization Plan is an initiative that aims to address major areas of open source security, including education, risk assessment, digital signatures, memory safety, incident response, and software supply chain management.
  • Mitre has released their annual list of the 25 most dangerous software weaknesses (bugs, flaws, vulnerabilities).
  • Patches for the Log4J vulnerability were released back in February, 2022, but many organizations have not applied them, and remain vulnerable to attack.
Programming
  • Microsoft and Oracle have announced Oracle Data Service, which allows applications running on Azure to manage and use data in Oracle’s cloud. It’s a multicloud strategy that’s enabled by the cloud providers.
  • Google has announced a new programming language, Carbon, that is intended to be the successor to C++. One goal is complete interoperability between Carbon and existing C++ code and libraries.
  • How to save money on AWS Lambda: watch your memory!  Don’t over-allocate memory. This probably only applies to a few of your functions, but those functions are what drive the cost up.
  • SocialCyber is a DARPA program to understand the internals of open source software, along with the communities that create the software. They plan to use machine learning heavily, both to understand the code and to map and analyze communications within the communities. They are concerned about potential vulnerabilities in the software that the US military depends on.
  • WebAssembly in the cloud? Maybe it isn’t just a client-side technology. As language support grows, so do the kinds of applications Wasm can support.
  • A survey reports that 62% of its respondents were only “somewhat confident” that open source software was “secure, up-to-date, and well-maintained.” Disappointing as this may be, it’s actually an improvement over prior results.
  • Is low-code infrastructure as code the future of cloud operations?
  • Tiny Core Linux is amazingly small: a 22MB download, and runs in 48MB of RAM. As a consequence, it’s also amazingly fast. With a few exceptions, making things small has not been a trend over the past few years. We hope to see more of this.
  • Yet another JavaScript web framework? Fresh does server-side rendering, and is based on Deno rather than NodeJS.
  • Facebook is considering whether to rescind its bans on health misinformation. The pandemic is over, after all. Except that it isn’t. However, being a conduit for health misinformation is clearly profitable.
  • Priority Hints are a way for web developers to tell the browser which parts of the page are most important, so that they can be rendered quickly. They are currently supported by the Chrome and Edge browsers.
  • Hotwire, HTMX, and Unpoly are frameworks for building complex web applications while minimizing the need for complex JavaScript. Are they an alternative to heavyweight JavaScript frameworks like React? Could a return to server-side web applications lead to a resurgence of platforms like Ruby on Rails?
  • Facebook has started encrypting the portions of URLs that are used to track users, preventing the Firefox and Brave browsers from stripping the tracking portion of the URL.
  • A priori censorship?  A popular cloud-based word processor in China has been observed censoring content upon the creation of a link for sharing the content. The document is locked; it cannot be edited or even opened by the author.
  • The Pirate Library Mirror is exactly what it says: a mirror of libraries of pirated books. It is focused on the preservation of human knowledge. There is no search engine, and it is only accessible by using BitTorrent over Tor.
  • Minecraft has decided that they will not “support or allow” the integration of NFTs into their virtual worlds. They object to “digital ownership based on scarcity and exclusion.”
  • Mixers are cryptocurrency services that randomize the currency you use; rather than pay with your own coin, you deposit money in a mixer and pay with randomly selected coins from other users. It’s similar to a traditional bank in that you never withdraw the same money you deposited.
  • So much for privacy. Coinbase, one of the largest cryptocurrency exchanges, sells geolocation data to ICE (the US Immigration and Customs Enforcement agency).
Quantum Computing
  • Quantum computers aren’t limited to binary: That limit is imposed by analogy to traditional computers, but some quantum computers have access to more state, and taking advantage of those states may make applications like simulating physical or biological systems easier.
  • Is quantum-aided computing for some industrial applications just around the corner? IonQ and GE have announced results from a hybrid system for risk management. The quantum computer does random sampling from probability distributions, which are computationally expensive for classical computers; the rest of the computation is classical.
  • Quantum networking is becoming real: researchers have created entangled qubits via a 33-mile fiber optic connection. In addition to their importance for secure communications, quantum networks may be a crucial step in building quantum computers at scale.
  • NIST has announced four candidate algorithms for post-quantum cryptography. While it may be years before quantum computing can break current algorithms, many organizations are anxious to start the transition from current algorithms.
Biology
  • Not long ago (2020), DeepMind released AlphaFold, which used AI to solve protein folding problems. In 2021, they announced a public database containing the structure of a million proteins. With their latest additions, that database now contains the structure of over 200 million proteins, almost every protein known to science.
  • A motor made of DNA!  This nanoscale motor uses ideas from origami to fold DNA in a way that causes it to rotate when an electrical field is applied.
  • An electrode has been implanted into the brain of an ALS patient that will allow them to communicate thoughts via computer. The patient has otherwise lost the ability to move or speak.
  • Genetic editing with CRISPR was tested in a human to permanently lower LDL (“bad cholesterol”) levels. If this works, it could make heart attacks much rarer, and could be the first widespread use of CRISPR in humans.
Energy Work
  • Some cities (largely in the US South and Midwest) are giving cash bonuses to tech workers who are willing to move there and work remotely.
  • The FBI is warning employers that they are seeing an increasing number of fraudulent applications for remote work in which the application uses stolen personal information and deepfake imagery.
Categories: Technology

SQL: The Universal Solvent for REST APIs

O'Reilly Radar - Tue, 2022/07/19 - 04:16

Data scientists working in Python or R typically acquire data by way of REST APIs. Both environments provide libraries that help you make HTTP calls to REST endpoints, then transform JSON responses into dataframes. But that’s never as simple as we’d like. When you’re reading a lot of data from a REST API, you need to do it a page at a time, but pagination works differently from one API to the next. So does unpacking the resulting JSON structures. HTTP and JSON are low-level standards, and REST is a loosely-defined framework, but nothing guarantees absolute simplicity, never mind consistency across APIs.
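To make the pagination problem concrete, here is a minimal sketch of two common styles side by side. Both "APIs" are hypothetical in-memory stand-ins written for this example (`fetch_by_page` and `fetch_by_cursor` are not real library calls, and no real service is modeled); the point is only that the client loop has to change shape depending on how the API pages its results.

```python
from typing import Iterator, Optional

# Stand-in data; a real client would make HTTP calls instead.
RECORDS = [{"id": i} for i in range(1, 8)]


def fetch_by_page(page: int, per_page: int = 3) -> dict:
    """Hypothetical API 1: numbered pages; an empty list means 'no more data'."""
    start = (page - 1) * per_page
    return {"items": RECORDS[start:start + per_page]}


def fetch_by_cursor(cursor: Optional[int] = None, limit: int = 3) -> dict:
    """Hypothetical API 2: a cursor token; next_cursor is None on the last page."""
    start = cursor or 0
    end = start + limit
    next_cursor = end if end < len(RECORDS) else None
    return {"data": RECORDS[start:end], "next_cursor": next_cursor}


def read_all_pages() -> Iterator[dict]:
    """Loop for page-number pagination: stop on the first empty page."""
    page = 1
    while True:
        items = fetch_by_page(page)["items"]
        if not items:
            break
        yield from items
        page += 1


def read_all_cursor() -> Iterator[dict]:
    """Loop for cursor pagination: follow next_cursor until it is None."""
    cursor = None
    while True:
        response = fetch_by_cursor(cursor)
        yield from response["data"]
        cursor = response["next_cursor"]
        if cursor is None:
            break


print(list(read_all_pages()) == list(read_all_cursor()))
```

Note that even the payload keys differ ("items" versus "data"), so the unpacking code is API-specific as well as the loop.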

What if there were a way of reading from APIs that abstracted all the low-level grunt work and worked the same way everywhere? Good news! That is exactly what Steampipe does. It’s a tool that translates REST API calls directly into SQL tables. Here are three examples of questions that you can ask and answer using Steampipe.

1. Twitter: What are recent tweets that mention PySpark?

Here’s a SQL query to ask that question:

    select
      id,
      text
    from
      twitter_search_recent
    where
      query = 'pyspark'
    order by
      created_at desc
    limit 5;

Here’s the answer:

    +---------------------+------------------------------------------------------------------------------------------------>
    | id                  | text                                                                                           >
    +---------------------+------------------------------------------------------------------------------------------------>
    | 1526351943249154050 | @dump Tenho trabalhando bastante com Spark, mas especificamente o PySpark. Vale a pena usar um >
    | 1526336147856687105 | RT @MitchellvRijkom: PySpark Tip ⚡                                                             >
    |                     |                                                                                                >
    |                     | When to use what StorageLevel for Cache / Persist?                                             >
    |                     |                                                                                                >
    |                     | StorageLevel decides how and where data should be s…                                           >
    | 1526322757880848385 | Solve challenges and exceed expectations with a career as a AWS Pyspark Engineer.              >
    | 1526318637485010944 | RT @JosMiguelMoya1: #pyspark #spark #BigData curso completo de Python y Spark con PySpark      >
    |                     |                                                                                                >
    |                     |                                                                                                >
    | 1526318107228524545 | RT @money_personal: PySpark & AWS: Master Big Data With PySpark and AWS                        >
    |                     | #ApacheSpark #AWSDatabases #BigData #PySpark #100DaysofCode                                    >
    |                     | -> http…                                                                                       >
    +---------------------+------------------------------------------------------------------------------------------------>

The table that’s being queried here, twitter_search_recent, receives the output from Twitter’s /2/tweets/search/recent endpoint and formulates it as a table with these columns. You don’t have to make an HTTP call to that API endpoint or unpack the results; you just write a SQL query that refers to the documented columns. One of those columns, query, is special: it encapsulates Twitter’s query syntax. Here, we are just looking for tweets that match PySpark, but we could as easily refine the query by pinning it to specific users, URLs, types (is:retweet, is:reply), properties (has:mentions, has:media), etc. That query syntax is the same no matter how you’re accessing the API: from Python, from R, or from Steampipe. It’s plenty to think about, and all you should really need to know when crafting queries to mine Twitter data.

2. GitHub: What are repositories that mention PySpark?

Here’s a SQL query to ask that question:

    select
      name,
      owner_login,
      stargazers_count
    from
      github_search_repository
    where
      query = 'pyspark'
    order by
      stargazers_count desc
    limit 10;

Here’s the answer:

    +----------------------+-------------------+------------------+
    | name                 | owner_login       | stargazers_count |
    +----------------------+-------------------+------------------+
    | SynapseML            | microsoft         | 3297             |
    | spark-nlp            | JohnSnowLabs      | 2725             |
    | incubator-linkis     | apache            | 2524             |
    | ibis                 | ibis-project      | 1805             |
    | spark-py-notebooks   | jadianes          | 1455             |
    | petastorm            | uber              | 1423             |
    | awesome-spark        | awesome-spark     | 1314             |
    | sparkit-learn        | lensacom          | 1124             |
    | sparkmagic           | jupyter-incubator | 1121             |
    | data-algorithms-book | mahmoudparsian    | 1001             |
    +----------------------+-------------------+------------------+

This looks very similar to the first example! In this case, the table that’s being queried, github_search_repository, receives the output from GitHub’s /search/repositories endpoint and formulates it as a table with these columns.

In both cases the Steampipe documentation not only shows you the schemas that govern the mapped tables, it also gives examples (Twitter, GitHub) of SQL queries that use the tables in various ways.

Note that these are just two of many available tables. The Twitter API is mapped to 7 tables, and the GitHub API is mapped to 41 tables.

3. Twitter + GitHub: What have owners of PySpark-related repositories tweeted lately?

To answer this question we need to consult two different APIs, then join their results. That’s even harder to do, in a consistent way, when you’re reasoning over REST payloads in Python or R. But this is the kind of thing SQL was born to do. Here’s one way to ask the question in SQL.

    -- NOTE: several tokens were lost from the published text of this query.
    -- The CTE name twitter_userids, the expressions t.author, t.id, and u.id,
    -- the 'https://twitter.com/' prefix, and the final sort column are
    -- restored here as assumptions.

    -- find pyspark repos
    with github_repos as (
      select
        name,
        owner_login,
        stargazers_count
      from
        github_search_repository
      where
        query = 'pyspark' and name ~ 'pyspark'
      order by
        stargazers_count desc
      limit 50
    ),

    -- find twitter handles of repo owners
    github_users as (
      select
        u.login,
        u.twitter_username
      from
        github_user u
      join
        github_repos r
      on
        r.owner_login = u.login
      where
        u.twitter_username is not null
    ),

    -- find corresponding twitter users
    twitter_userids as (
      select
        id
      from
        twitter_user t
      join
        github_users g
      on
        t.username = g.twitter_username
    )

    -- find tweets from those users
    select
      t.author->>'username' as twitter_user,
      'https://twitter.com/' || (t.author->>'username') || '/status/' || t.id as url,
      t.text
    from
      twitter_user_tweet t
    join
      twitter_userids u
    on
      t.user_id = u.id
    where
      t.created_at > now()::date - interval '1 week'
    order by
      t.created_at desc  -- assumed sort column
    limit 5

Here is the answer:

    +----------------+---------------------------------------------------------------+------------------------------------->
    | twitter_user   | url                                                           | text                                >
    +----------------+---------------------------------------------------------------+------------------------------------->
    | idealoTech     |                                                               | Are you able to find creative soluti>
    |                |                                                               |                                     >
    |                |                                                               | Join our @codility Order #API Challe>
    |                |                                                               |                                     >
    |                |                                                               | #idealolife #codility #php          >
    | idealoTech     |                                                               | Our #ProductDiscovery team at idealo>
    |                |                                                               |                                     >
    |                |                                                               | Think you can solve it? 😎           >
    |                |                                                               | ➡ https://t                         >
    | ioannides_alex |                                                               | RT @scikit_learn: scikit-learn 1.1 i>
    |                |                                                               | What's new? You can check the releas>
    |                |                                                               |                                     >
    |                |                                                               | pip install -U…                     >
    | andfanilo      |                                                               | @edelynn_belle Thanks! Sometimes it >
    | andfanilo      |                                                               | @juliafmorgado Good luck on the reco>
    |                |                                                               |                                     >
    |                |                                                               | My advice: power through it + a dead>
    |                |                                                               |                                     >
    |                |                                                               | I hated my first few short videos bu>
    |                |                                                               |                                     >
    |                |                                                               | Looking forward to the video 🙂      >
    +----------------+---------------------------------------------------------------+------------------------------------->

When APIs frictionlessly become tables, you can devote your full attention to reasoning over the abstractions represented by those APIs. Larry Wall, the creator of Perl, famously said: “Easy things should be easy, hard things should be possible.” The first two examples are things that should be, and are, easy: each is just 10 lines of simple, straight-ahead SQL that requires no wizardry at all.

The third example is a harder thing. It would be hard in any programming language. But SQL makes it possible in several nice ways. The solution is made of concise stanzas (CTEs, Common Table Expressions) that form a pipeline. Each phase of the pipeline handles one clearly-defined piece of the problem. You can validate the output of each phase before proceeding to the next. And you can do all this with the most mature and widely-used grammar for selection, filtering, and recombination of data.
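
The pipeline idea can be tried without any API access. The following is a minimal sketch, not the query from this article: it uses Python's built-in sqlite3 module in place of Steampipe's Postgres, and the tables `repos` and `users`, their rows, and the CTE names `pyspark_repos` and `owners_with_twitter` are all hypothetical stand-ins. The first stage is run and checked on its own before the second stage is chained onto it.

```python
import sqlite3

# Hypothetical stand-in data; real Steampipe tables would be backed by live APIs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table repos (name text, owner_login text, stars integer);
    create table users (login text, twitter_username text);
    insert into repos values
        ('pyspark-examples', 'alice', 120),
        ('pyspark-utils', 'bob', 45),
        ('dotfiles', 'carol', 300);
    insert into users values
        ('alice', 'alice_tw'),
        ('bob', null),
        ('carol', 'carol_tw');
""")

# Stage 1 on its own: validate its output before building on it.
STAGE_1 = """
    select name, owner_login, stars
    from repos
    where name like '%pyspark%'
    order by stars desc
"""
stage_1_rows = conn.execute(STAGE_1).fetchall()

# Stages 1 and 2 chained as CTEs: each stanza feeds the next.
PIPELINE = f"""
    with pyspark_repos as ({STAGE_1}),
    owners_with_twitter as (
        select u.login, u.twitter_username
        from users u
        join pyspark_repos r on r.owner_login = u.login
        where u.twitter_username is not null
    )
    select login, twitter_username from owners_with_twitter
"""
pipeline_rows = conn.execute(PIPELINE).fetchall()

print(stage_1_rows)
print(pipeline_rows)
```

Keeping each stage as its own named stanza means a surprising final result can be traced back by running the stages one at a time.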

Do I have to use SQL?

No! If you like the idea of mapping APIs to tables, but you would rather reason over those tables in Python or R dataframes, then Steampipe can oblige. Under the covers it’s Postgres, enhanced with foreign data wrappers that handle the API-to-table transformation. Anything that can connect to Postgres can connect to Steampipe, including SQL drivers like Python’s psycopg2 and R’s RPostgres as well as business-intelligence tools like Metabase, Tableau, and PowerBI. So you can use Steampipe to frictionlessly consume APIs into dataframes, then reason over the data in Python or R.

But if you haven’t used SQL in this way before, it’s worth a look. Consider this comparison of SQL to Pandas from How to rewrite your SQL queries in Pandas.

SQL | Pandas
select * from airports | airports
select * from airports limit 3 | airports.head(3)
select id from airports where ident = 'KLAX' | airports[airports.ident == 'KLAX'].id
select distinct type from airports | airports.type.unique()
select * from airports where iso_region = 'US-CA' and type = 'seaplane_base' | airports[(airports.iso_region == 'US-CA') & (airports.type == 'seaplane_base')]
select ident, name, municipality from airports where iso_region = 'US-CA' and type = 'large_airport' | airports[(airports.iso_region == 'US-CA') & (airports.type == 'large_airport')][['ident', 'name', 'municipality']]

We can argue the merits of one style versus the other, but there’s no question that SQL is the most universal and widely-implemented way to express these operations on data. So no, you don’t have to use SQL to its fullest potential in order to benefit from Steampipe. But you might find that you want to.
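
To try a few of the SQL queries from that comparison without installing anything, here is a minimal sketch using Python's built-in sqlite3 module. The three-row `airports` table is hypothetical sample data (only the column names come from the queries in the comparison), SQLite stands in for Postgres, and the plain-Python filter at the end is a stand-in for the Pandas expression, not Pandas itself.

```python
import sqlite3

# Hypothetical sample rows; column names follow the queries in the comparison.
rows = [
    (1, "KLAX", "Los Angeles International", "large_airport", "US-CA", "Los Angeles"),
    (2, "KSFO", "San Francisco International", "large_airport", "US-CA", "San Francisco"),
    (3, "XX01", "Example Seaplane Base", "seaplane_base", "US-CA", "Exampleville"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table airports "
    "(id integer, ident text, name text, type text, iso_region text, municipality text)"
)
conn.executemany("insert into airports values (?, ?, ?, ?, ?, ?)", rows)

# select id from airports where ident = 'KLAX'
klax_ids = [r[0] for r in conn.execute("select id from airports where ident = 'KLAX'")]

# select distinct type from airports
types = sorted(r[0] for r in conn.execute("select distinct type from airports"))

# select ident, name, municipality from airports where iso_region = ... and type = ...
large = conn.execute(
    "select ident, name, municipality from airports "
    "where iso_region = ? and type = ? order by ident",
    ("US-CA", "large_airport"),
).fetchall()

# The same filter in plain Python, for comparison with the SQL above.
large_py = sorted(
    (r[1], r[2], r[5]) for r in rows if r[4] == "US-CA" and r[3] == "large_airport"
)

print(klax_ids, types, large)
```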

Categories: Technology

Artificial Creativity?

O'Reilly Radar - Tue, 2022/07/12 - 06:24

There’s a puzzling disconnect in the many articles I read about DALL-E 2, Imagen, and the other increasingly powerful tools I see for generating images from textual descriptions. It’s common to read articles that talk about AI having creativity–but I don’t think that’s the case at all.  As with the discussion of sentience, authors are being misled by a very human will to believe. And in being misled, they’re missing out on what’s important.

It’s impressive to see AI-generated pictures of an astronaut riding a horse, or a dog riding a bike in Times Square. But where’s the creativity? Is it in the prompt or in the product? I couldn’t draw a picture of a dog riding a bike; I’m not that good an artist. Given a few pictures of dogs, Times Square, and whatnot, I could probably photoshop my way into something passable, but not very good. (To be clear: these AI systems are not automating Photoshop.) So the AI is doing something that many, perhaps most, humans wouldn’t be able to do. That’s important. Very few humans (if any) can play Go at the level of AlphaGo. We’re getting used to being second-best.

However, a computer replacing a human’s limited photoshop skills isn’t creativity. It took a human to say “create a picture of a dog riding a bike.” An AI couldn’t do that of its own volition. That’s creativity. But before writing off the creation of the picture, let’s think more about what that really means. Works of art really have two sources: the idea itself and the technique required to instantiate that idea. You can have all the ideas you want, but if you can’t paint like Rembrandt, you’ll never generate a Dutch master. Throughout history, painters have learned technique by copying the works of masters. What’s interesting about DALL-E, Imagen, and their relatives is that they supply the technique. Using DALL-E or Imagen, I could create a painting of a tarsier eating an anaconda without knowing how to paint.

That distinction strikes me as very important. In the 20th and 21st centuries we’ve become very impatient with technique. We haven’t become impatient with creating good ideas. (Or at least strange ideas.) The “age of mechanical reproduction” seems to have made technique less relevant; after all, we’re heirs of the poet Ezra Pound, who famously said, “Make it new.”

But does that quote mean what we think? Pound’s “Make it new” has been traced back to 18th century China, and from there to the 12th century, something that’s not at all surprising if you’re familiar with Pound’s fascination with Chinese literature. What’s interesting, though, is that Chinese art has always focused on technique to a level that’s almost inconceivable to the European tradition. And “Make it new” has, within it, the acknowledgment that what’s new first has to be made. Creativity and technique don’t come apart that easily.

We can see that in other art forms. Beethoven broke Classical music and put it back together again, but different: he’s the most radical composer in the Western tradition (except for, perhaps, Thelonious Monk). And it’s worth asking how we get from what’s old to what’s new. AI has been used to complete Beethoven’s 10th symphony, for which Beethoven left a number of sketches and notes at the time of his death. The result is pretty good, better than the human attempts I’ve heard at completing the 10th. It sounds Beethoven-like; its flaw is that it goes on and on, repeating Beethoven-like riffs but without the tremendous forward-moving force that you get in Beethoven’s compositions. But completing the 10th isn’t the problem we should be looking at. How did we get Beethoven in the first place? If you trained an AI on the music Beethoven was trained on, would you eventually get the 9th symphony? Or would you get something that sounds a lot like Mozart and Haydn?

I’m betting the latter. The progress of art isn’t unlike the structure of scientific revolutions, and Beethoven indeed took everything that was known, broke it apart, and put it back together differently. Listen to the opening of Beethoven’s 9th symphony: what is happening? Where’s the theme? It sounds like the orchestra is tuning up. When the first theme finally arrives, it’s not the traditional “melody” that pre-Beethoven listeners would have expected, but something that dissolves back into the sound of instruments tuning, then gets reformed and reshaped. Mozart would never do this. Or listen again to Beethoven’s 5th symphony, probably the most familiar piece of orchestral music in the world. That opening duh-duh-duh-DAH–what kind of theme is that? Beethoven builds this movement by taking that four note fragment, moving it around, changing it, breaking it into even smaller bits and reassembling them. You can’t imagine a witty, urbane, polite composer like Haydn writing music like this. But I don’t want to worship some notion of Beethoven’s “genius” that privileges creativity over technique. Beethoven could never have gotten beyond Mozart and Haydn (with whom Beethoven studied) without extensive knowledge of the technique of composing; he would have had some good ideas, but he would never have known how to realize them. Conversely, the realization of radical ideas as actual works of art inevitably changes the technique. Beethoven did things that weren’t conceivable to Mozart or Haydn, and they changed the way music was written: those changes made the music of Schubert, Schumann, and Brahms possible, along with the rest of the 19th century.

That brings us back to the question of computers, creativity, and craft. Systems like DALL-E and Imagen break apart the idea and the technique, or the execution of the idea. Does that help us be more creative, or less? I could tell Imagen to “paint a picture of a 15th century woman with an enigmatic smile,” and after a few thousand tries I might get something like the Mona Lisa. I don’t think that anyone would care, really.  But this isn’t creating something new; it’s reproducing something old. If I magically appeared early in the 20th century, along with a computer capable of running Imagen (though only trained on art through 1900), would I be able to tell it to create a Picasso or a Dali? I have no idea how to do that. Nor do I have any idea what the next step for art is now, in the 21st century, or how I’d ask Imagen to create it. It sure isn’t Bored Apes. And if I could ask Imagen or DALL-E to create a painting from the 22nd century, how would that change the AI’s conception of technique?

At least part of what I lack is the technique, for technique isn’t just mechanical ability; it’s also the ability to think the way great artists do. And that gets us to the big question:

Now that we have abstracted technique away from the artistic process, can we build interfaces between the creators of ideas and the machines of technique in a way that allows the creators to “make it new”?  That’s what we really want from creativity: something that didn’t exist, and couldn’t have existed, before.

Can artificial intelligence help us to be creative? That’s the important question, and it’s a question about user interfaces, not about who has the biggest model.

Categories: Technology

Radar Trends to Watch: July 2022

O'Reilly Radar - Tue, 2022/07/05 - 04:09

This month, large models are even more in the news than last month: the open source BLOOM model is almost finished, Google’s LaMDA is good enough that it can trick people into thinking it’s sentient, and DALL-E has gotten even better at drawing what you ask.

The most important issue facing technology might now be the protection of privacy. While that’s not a new concern, it’s a concern that most computer users have been willing to ignore, and that most technology companies have been willing to let them ignore. New state laws that criminalize having abortions out of state and the stockpiling of location information by antiabortion groups have made privacy an issue that can’t be ignored.

Artificial Intelligence
  • BigScience has almost finished training its open source BLOOM language model, which was developed by volunteer researchers and trained using public funds. BLOOM will provide an open, public platform for research into the capabilities of large language models and, specifically, issues like avoiding bias and toxic language.
  • AI tools like AlphaFold2 can create new proteins, not just analyze existing ones; the unexpected creation of new artifacts by an AI system is playfully called “hallucination.” The proteins designed so far probably aren’t useful; still, this is a major step forward in drug design.
  • Microsoft is limiting or removing access to some features in its face recognition service, Azure Face. Organizations will have to tell Microsoft how and why facial recognition will be used in their systems; and services like emotion recognition will be removed completely.
  • Amazon plans to give Alexa the ability to imitate anyone’s voice, using under a minute of audio. They give the example of a (possibly dead) grandmother “reading” a book to a child. Other AI vendors (most notably OpenAI/Microsoft) have considered such mimicry unethical.
  • Dolt is a SQL database that lets you version data using git commands. You can clone, push, pull, fork, branch, and merge just as with git; you access data using standard SQL.
  • It’s sadly unsurprising that a robot incorporating a widely-used neural network (OpenAI CLIP) learns racist and sexist biases, and that these biases affect its performance on tasks.
  • Building autonomous vehicles with memory, so that they can learn about objects on the routes they drive, may be an important step in making AVs practical. In real life, most people drive over routes they are already familiar with. Autonomous vehicles should have the same advantage.
  • The argument about whether Google’s LaMDA is “sentient” continues, with a Google engineer placed on administrative leave for publishing transcripts of conversations that he claimed demonstrate sentience. Or are large language models just squirrels?
  • For artists working in collaboration with AI, the possibilities and imperfections of AI are a means of extending their creativity.
  • Pete Warden’s proposal for ML Sensors could make developing embedded ML systems much simpler: push the machine learning into the sensors themselves.
  • Researchers using DALL-E 2 discovered that the model has a “secret vocabulary” that’s not human language, but that can be used somewhat reliably to create consistent pictures. It may be an artifact of the model’s inability to say “I didn’t understand that”; given nonsense input, it is pulled towards similar words in the training corpus.
  • HuggingFace has made an agreement with Microsoft that will allow Azure customers to run HuggingFace language models on the Azure platform.
  • The startup Predibase has built a declarative low-code platform for building AI systems. In a declarative system, you describe the outcome you want, rather than the process for creating the outcome. The system figures out the process.
  • Researchers are developing AI models that implement metamemory: the ability to remember whether or not you know something.
  • As the population ages, it will be more important to diagnose diseases like Alzheimer’s early, when treatment is still meaningful. AI is providing tools to help doctors analyze MRI images more accurately than humans. These tools don’t attempt diagnosis; they provide data about brain features.
  • Google has banned the training of Deepfakes on Colab, its free Jupyter-based cloud programming platform.
  • Samsung and RedHat are working on new memory architectures and device drivers that will be adequate to the demands of a 3D-enabled, cloud-based metaverse.
  • The Metaverse Standards Forum is a new industry group with the goal of solving interoperability problems for the Metaverse. It views the Metaverse as the outgrowth of the Web, and plans to coordinate work between existing standards groups (like the W3C) relevant to the Metaverse.
  • Can the “Open Metaverse” be the future of the Internet?  The Open Metaverse Interoperability Group is building vendor-independent standards for social graphs, identities, and other elements of a Metaverse.
  • Holographic heads-up displays allow for 3D augmented reality: the ability to project 3D images onto the real world (for example, onto a car’s windshield).
  • Google’s Visual Position Service uses the data they’ve collected through Street View to provide high-accuracy positioning data for augmented reality applications. (This may be related to Niantic’s VPS, or they may just be using the same acronym.)
Security Programming
  • Amazon has launched CodeWhisperer, a direct competitor to GitHub Copilot.
  • Linus Torvalds predicts that Rust will be used in the Linux kernel by 2023.
  • GitHub Copilot is now generally available (for a price); it’s free to students and open source maintainers. Corporate licenses will be available later this year.
  • WebAssembly is making inroads. The universal WebAssembly runtime, Wasmer, runs any code, on any platform. Impressive, if it delivers.
  • Can WebAssembly replace Docker? Maybe, in some applications. WASM provides portability and eliminates some security issues (possibly introducing its own); Docker sets up environments.
  • Mozilla’s Project Bergamot is an automated translation tool designed for use on the Web. It can be used to build multilingual forms and other web pages. Unlike most other AI technologies, Bergamot runs in the browser using WASM. No data is sent to the cloud.
  • Microsoft has released a framework called Fluid for building collaborative apps, such as Slack, Discord, and Teams. Microsoft will also be releasing Azure Fluid Relay to support Fluid-based applications.
  • Dragonfly is a new in-memory database that claims significantly faster performance than memcached and Redis.
  • The Chinese government has blocked access to open source code on Gitee, the Chinese equivalent to GitHub, saying that all code must be reviewed by the government before it can be released to the public.
  • Is Blockchain Decentralized? A study commissioned by DARPA investigates whether a blockchain is truly immutable, or whether it can be modified, not by exploiting cryptographic vulnerabilities, but by attacking the blockchain’s implementation, networking, and consensus protocols. This is the most comprehensive examination of blockchain security that we’ve seen.
  • Jack Dorsey has announced that he’s working on Web5, which will be focused on identity management and be based on Bitcoin.
  • Molly White’s post questioning the possibility of acceptably non-dystopian self-sovereign identity is a must-read; she has an excellent summary and critique of just about all the work going on in the field.
  • Cryptographer Matthew Green makes an important argument for the technologies behind cryptocurrency (though not for the current implementations).
Biology Quantum Computing
  • Probabilistic computers, built from probabilistic bits (p-bits), may provide a significant step forward for probabilistic decision making. This sounds esoteric, but it’s essentially what we’re asking AI systems to do. P-bits may also be able to simulate qubits and quantum computing.
  • A system that links two time crystals could be the basis for a new form of quantum computing. Time crystals can exist at room temperature, and remain coherent for much longer than existing qubit technologies.
Categories: Technology

2022 Cloud Salary Survey

O'Reilly Radar - Wed, 2022/06/22 - 04:21

Last year, our report on cloud adoption concluded that adoption was proceeding rapidly; almost all organizations are using cloud services. Those findings confirmed the results we got in 2020: everything was “up and to the right.” That’s probably still true—but saying “everything is still up and to the right” would be neither interesting nor informative. So rather than confirming the same results for a third year, we decided to do something different.

This year’s survey asked questions about compensation for “cloud professionals”: the software developers, operations staff, and others who build cloud-based applications, manage a cloud platform, and use cloud services. We limited the survey to residents of the United States because salaries from different countries aren’t directly comparable; in addition to fluctuating exchange rates, there are different norms for appropriate compensation. This survey ran from April 4 through April 15, 2022, and was publicized via email to recipients of our Infrastructure & Ops Newsletter whom we could identify as residing in the United States or whose location was unknown.

Executive Summary
  • Survey respondents earn an average salary of $182,000.
  • The average salary increase over the past year was 4.3%.
  • 20% of respondents reported changing employers in the past year.
  • 25% of respondents are planning to change employers because of compensation.
  • The average salary for women is 7% lower than the average salary for men.
  • 63% of respondents work remotely all the time; 94% work remotely at least one day a week.
  • Respondents who participated in 40 or more hours of training in the past year received higher salary increases.

Of the 1,408 responses we initially received, 468 were disqualified. Respondents were disqualified (and the survey terminated) if the respondent said they weren’t a US resident or if they were under 18 years old; respondents were also disqualified if they said they weren’t involved with their organization’s use of cloud services. Another 162 respondents filled out part of the survey but didn’t complete it; we chose to include only complete responses. That left us with 778 responses. Participants came from 43 states plus Washington, DC. As with our other surveys, the respondents were a relatively senior group: the average age was 47 years old, and while the largest number identified themselves as programmers (43%), 14% identified as executives and 33% as architects.
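
The response counts above can be checked with a few lines of arithmetic. This sketch uses only the figures stated in the paragraph; the variable names are ours, and the share of responses kept is a derived number that the report itself does not quote.

```python
# Figures as reported in the survey description.
initial_responses = 1408
disqualified = 468
incomplete = 162

# Responses remaining after removing disqualified and incomplete ones.
complete_responses = initial_responses - disqualified - incomplete

# Share of the initial responses that made it into the analysis.
share_kept = complete_responses / initial_responses

print(complete_responses)
print(f"{share_kept:.1%}")
```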

The Big Picture

Cloud professionals are well paid. That’s not a surprise in itself. We expected salaries (including bonuses) to be high, and they were. The cloud professionals who responded to our survey earn an average salary of $182,000; the most common salary range among respondents was $150,000 to $175,000 per year (16% of the total), as shown in Figure 1. The peak was fairly broad: 68% of the respondents earn between $100,000 and $225,000 per year. And there was a significant “long tail” in the compensation stratosphere: 7% of the respondents earn over $300,000 per year, and 2.4% over $400,000 per year.

Figure 1. Annual salary by percentage of respondents

We believe that job changes are part of what’s driving high salaries. After all, we’ve heard about talent shortages in almost every field, with many employers offering very high salaries to attract the staff they need. By staying with their current employer, an employee may get an annual salary increase of 4%. But if they change jobs, they might get a significantly higher offer—20% or more—plus a signing bonus.

20% of the respondents reported that they changed employers in the past year. That number isn’t high in and of itself, but it looks a lot higher when you add it to the 25% who are planning to leave jobs over compensation. (Another 20% of the respondents declined to answer this question.) It’s also indicative that 19% of the respondents received promotions. There was some overlap between those who received promotions and those who changed jobs (5% of the total said “yes” to both questions, or roughly one quarter of those who changed jobs). When you look at the number of respondents who left their employer, are planning to leave their employer, or got a promotion and a salary increase, it’s easy to see why salary budgets are under pressure. Right now, qualified candidates have the power in the job market, though with the stock market correction that began in March 2022 and significant layoffs from some large technology-sector companies, that may be changing.

These conclusions are borne out when you look at the salaries of those who were promoted, changed jobs, or intend to change jobs. A promotion roughly doubled respondents’ year-over-year salary increase. On average, those who were promoted received a 7% raise; those who weren’t promoted received a 3.7% increase. The result was almost exactly the same for those who changed jobs: those who changed averaged a 6.8% salary increase, while those who remained averaged 3.7%. We also see a difference in the salaries of those who intend to leave because of compensation: their average salary is $171,000, as opposed to $188,000 for those who didn’t plan to leave. That’s a $17,000 difference, or roughly 10%.
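
A percentage difference depends on which group is taken as the baseline, so it can help to work the comparison both ways. The sketch below uses only the figures quoted in the paragraph above; the helper `pct_diff` and the variable names are ours, not part of the survey analysis.

```python
def pct_diff(value, baseline):
    """Difference between value and baseline, as a signed fraction of baseline."""
    return (value - baseline) / baseline

# Average salaries quoted in the text, in dollars.
planning_to_leave = 171_000
not_planning_to_leave = 188_000

gap = not_planning_to_leave - planning_to_leave

# Those planning to leave, measured against those who are not.
below_stayers = pct_diff(planning_to_leave, not_planning_to_leave)

# Those not planning to leave, measured against those who are.
above_leavers = pct_diff(not_planning_to_leave, planning_to_leave)

# Average raise for promoted respondents relative to everyone else (7% vs. 3.7%).
raise_ratio = 7.0 / 3.7

print(gap)
print(f"{below_stayers:+.1%} {above_leavers:+.1%}")
print(f"{raise_ratio:.2f}x")
```

Either baseline gives a figure between 9% and 10%, which matches the "roughly 10%" in the text, and the raise ratio is a little under 2, which matches "roughly doubled."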

Salaries by Gender

One goal of this survey was to determine whether women are being paid fairly. Last year’s salary survey for data and AI found a substantial difference between men’s and women’s salaries: women were paid 16% less than men. Would we see the same here?

The quick answer is “yes,” but the difference was smaller. Average salaries for women are 7% lower than for men ($172,000 as opposed to $185,000). But let’s take a step back before looking at salaries in more detail. We asked our respondents what pronouns they use. Only 8.5% said “she,” while 79% chose “he.” That’s still only 87% of the total. Where are the rest? 12% preferred not to say; this is a larger group than those who used “she.” 0.5% chose “other,” and 0.7% chose “they.” (That’s only four and six respondents, respectively.) Compared to results from our survey on the data/AI industry, the percentage of cloud professionals who self-identified as women appears to be much smaller (8.5%, as opposed to 14%). But there’s an important difference between the surveys: “I prefer not to answer” wasn’t an option for the Data/AI Salary Survey. We can’t do much with those responses. When we eyeballed the data for the “prefer not to say” group, we saw somewhat higher salaries than for women, but still significantly less (5% lower) than for men.

The difference between men’s and women’s salaries is smaller than we expected, given the results of last year’s Data/AI Salary Survey. But it’s still a real difference, and it raises the question: Is compensation improving for women? Talent shortages are driving compensation up in many segments of the software industry. Furthermore, the average reported salaries for both men and women in our survey are high. Again, is that a consequence of the talent shortage? Or is it an artifact of our sample, which appears to be somewhat older, and rich in executives? We can’t tell from a single year’s data, and the year-over-year comparison we made above is based on a different industry segment. But the evidence suggests that the salary gap is closing, and progress is being made. And that is indeed a good thing.

Salaries for respondents who answered “other” to the question about the pronouns they use are 31% lower than salaries for respondents who chose “he.” Likewise, salaries for respondents who chose “they” are 28% lower than men’s average salaries. However, both of these groups are extremely small, and in both groups, one or two individuals pulled the averages down. We could make the average salaries higher by calling these individuals “outliers” and removing their data; after all, outliers can have outsized effects on small groups. That’s a step we won’t take. Whatever the reason, the outliers are there; they’re part of the data. Professionals all across the spectrum have low-paying jobs—sometimes by choice, sometimes out of necessity. Why does there appear to be a concentration of them among people who don’t use “he” or “she” as their pronouns? The effect probably isn’t quite as strong as our data indicates, but we won’t try to explain our data away. It’s certainly indicative that the groups that use “they” or another pronoun than “he” or “she” showed a salary penalty. We have to conclude that respondents who use nonbinary pronouns earn lower salaries, but without more data, we don’t know why, nor do we know how much lower their salaries are or whether this difference would disappear with a larger sample.

To see more about the differences between men’s and women’s salaries, we looked at the men and women in each salary range. The overall shapes of the salary distributions are clear: a larger percentage of women earn salaries between $0 and $175,000, and (with two exceptions) a larger percentage of men earn salaries over $175,000. However, a slightly larger percentage of women earn supersize salaries ($400,000 or more), and a significantly larger percentage earn salaries between $225,000 and $250,000 (Figure 2).

Figure 2. Men’s and women’s salaries by percentage of respondents

We can get some additional information by looking at salary increases (Figure 3). On average, women’s salary increases were higher than men’s: $9,100 versus $8,100. That doesn’t look like a big difference, but it’s over 10%. We can read that as a sign that women’s salaries are certainly catching up. But the signals are mixed. Men’s salaries increased more than women’s in almost every segment, with two big exceptions: 12% of women received salary increases over $30,000, while only 8% of men did the same. Likewise, 17% of women received increases between $10,000 and $15,000, but only 9% of men did. These differences might well disappear with more data.

Figure 3. Salary increases for women and men by percentage of respondents

When we look at salary increases as a percentage of salary, we again see mixed results (Figure 4). Women’s salary increases were much larger than men’s in three bands: over $325,000 (with the exception of $375,000–$400,000, where there were no women respondents), $275,000–$300,000, and $150,000–$175,000. For those with very large salaries, women’s salary increases were much higher than men’s. Furthermore, the $150,000–$175,000 band had the largest number of women. While there was a lot of variability, salary increases are clearly an important factor driving women’s salaries toward parity with men’s.

Figure 4. Salary increases as a percentage of salary

The Effect of Education

The difference between men’s and women’s salaries is significant at almost every educational level (Figure 5). The difference is particularly high for respondents who are self-taught, where women earned 39% less ($112,000 versus $184,000), and for students (45% less, $87,000 versus $158,000). However, those were relatively small groups, with only two women in each group. It’s more important that for respondents with bachelor’s degrees, women’s salaries were 4% higher than men’s ($184,000 versus $176,000)—and this was the largest group in our survey. For respondents with advanced degrees, women with doctorates averaged a 15% lower salary than men with equivalent education; women with master’s degrees averaged 10% lower. The difference between women’s and men’s salaries appears to be greatest at the extremes of the educational spectrum.
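
The gaps quoted above can be recomputed from the dollar figures in the paragraph. This is a minimal sketch: the dictionary keys and the helper `gap_vs_men` are our own naming, and because the published salaries are rounded to the nearest thousand dollars, the results can differ slightly from the percentages in the text.

```python
# (women's average, men's average) in dollars, as quoted in the text.
salaries = {
    "self-taught": (112_000, 184_000),
    "student": (87_000, 158_000),
    "bachelor's degree": (184_000, 176_000),
}

def gap_vs_men(women, men):
    """Women's average relative to men's, as a signed fraction of men's."""
    return (women - men) / men

gaps = {group: gap_vs_men(w, m) for group, (w, m) in salaries.items()}

# Negative values mean women's average is lower; positive means higher.
for group, gap in gaps.items():
    print(f"{group}: {gap:+.1%}")
```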

Figure 5. Men’s and women’s salaries by degree

Salaries by State

Participants in the survey come from 43 states plus Washington, DC. Looking at salaries by state creates some interesting puzzles. The highest salaries are found in Oklahoma; South Dakota is third, following California. And the top of the list is an interesting mix of states where we expected high salaries (like New York) and states where we expected salaries to be lower. So what’s happening?

The average salary from Oklahoma is $225,000—but that only reflects two respondents, both of whom work remotely 100% of the time. (We’ll discuss remote work later in this report.) Do they work for a Silicon Valley company and get a Silicon Valley salary? We don’t know, but that’s certainly a possibility. The average salary for South Dakota is $212,000, but we shouldn’t call it an “average,” because we only had one response, and this respondent reported working remotely 1–4 days per week. Likewise, Vermont had a single respondent, who works remotely and who also had an above-average salary. Many other states have high average salaries but a very small number of respondents.

So the first conclusion that we can draw is that remote work might be making it possible for people in states without big technology industries to get high salaries. Or it could be the opposite: there’s no state without some businesses using the cloud, and the possibility of remote work puts employers in those states in direct competition with Silicon Valley salaries: they need to pay much higher salaries to get the expertise they need. And those job offers may include the opportunity to work remotely full or part time—even if the employer is local. Both of those possibilities no doubt hold true for individuals, if not for geographical regions as a whole.

Outliers aside, salaries are highest in California ($214,000), New York ($212,000), Washington ($203,000), Virginia ($195,000), and Illinois ($191,000). Massachusetts comes next at $189,000. At $183,000, average salaries in Texas are lower than we’d expect, but they’re still slightly above the national average ($182,000). States with high average salaries tended to have the largest numbers of respondents—with the important exceptions that we’ve already noted. The lowest salaries are found in West Virginia ($87,000) and New Mexico ($84,000), but these reflected a small number of respondents (one and four, respectively). These two states aside, the average salary in every state was over $120,000 (Figure 6).

So, is remote work equalizing salaries between different geographical regions? It’s still too early to say. We don’t think there will be a mass exodus from high-salary states to more rural states, but it’s clear that professionals who want to make that transition can, and that companies that aren’t in high-salary regions will need to offer salaries that compete in the nationwide market. Future surveys will tell us whether this pattern holds true.

Figure 6. Average salary by state

Salaries by Age

The largest group of respondents to our survey were between 45 and 54 years old (Figure 7). This group also had the highest average salary ($196,000). Salaries for respondents between 55 and 64 years old were lower (averaging $173,000), and salaries dropped even more for respondents over 65 ($139,000). Salaries for the 18- to 24-year-old age range were low, averaging $87,000. These lower salaries are no surprise because this group includes both students and those starting their first jobs after college.

It’s worth noting that our respondents were older than we expected; 29% were between 35 and 44 years old, 36% were between 45 and 54, and 22% were between 55 and 64. Data from our learning platform shows that this distribution isn’t indicative of the field as a whole, or of our audience. It may be an artifact of the survey itself. Are our newsletter readers older, or are older people more likely to respond to surveys? We don’t know.

Figure 7. Average salary by age

The drop in salaries after age 55 is surprising. Does seniority count for little? It’s easy to make hypotheses: Senior employees are less likely to change jobs, and we’ve seen that changing jobs drives higher salaries. But it’s also worth noting that AWS launched in 2002, roughly 20 years ago. People who are now 45 to 54 years old started their careers in the first years of Amazon’s rollout. They “grew up” with the cloud; they’re the real cloud natives, and that appears to be worth something in today’s market.

Job Titles and Roles

Job titles are problematic. There’s no standardized naming system, so a programming lead at one company might be an architect or even a CTO at another. So we ask about job titles at a fairly high level of abstraction. We offered respondents a choice of four “general” roles: executive, director, manager, or associate. We also allowed respondents to write in their own job titles; roughly half chose this option. The write-in titles were more descriptive and, as expected, inconsistent. We were able to group them into some significant clusters by looking for people whose write-in title used the words “engineer,” “programmer,” “developer,” “architect,” “consultant,” or “DevOps.” We also looked at two modifiers: “senior” and “lead.” There’s certainly room for overlap: someone could be a “senior DevOps engineer.” But in practice, overlap was small. (For example, no respondents used both “developer” and “architect” in a write-in job title.) There was no overlap between the titles submitted by respondents and the general titles we offered on the survey: our respondents had to choose one or the other.
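The keyword grouping described above can be sketched in a few lines. This is a hypothetical reconstruction, not the script used for the report: the function name `tag_title` and the sample titles are invented, and the whole-word matching rule is an assumption about how "used the words" might be applied.

```python
import re

# Keywords and modifiers named in the text above.
ROLE_KEYWORDS = ("engineer", "programmer", "developer",
                 "architect", "consultant", "devops")
MODIFIERS = ("senior", "lead")


def tag_title(title):
    """Return (roles, modifiers) found as whole words in a write-in title."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    roles = [k for k in ROLE_KEYWORDS if k in words]
    modifiers = [m for m in MODIFIERS if m in words]
    return roles, modifiers


# Hypothetical write-in titles, invented for illustration only.
for title in ("Senior DevOps Engineer", "Cloud Architect", "Head of Platform"):
    roles, modifiers = tag_title(title)
    print(f"{title!r}: roles={roles}, modifiers={modifiers}")
```

A title can land in more than one cluster (the first example matches both "engineer" and "devops"), which is the kind of overlap the text says was small in practice. Whole-word matching also means a title like "Engineering Manager" would not count as "engineer"; a substring match would make the opposite choice.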

So what did we see? As shown in Figure 8, the highest salaries go to those who classified themselves as directors ($235,000) or executives ($231,000). Salaries for architects, “leads,” and managers are on the next tier ($196,000, $190,000, and $188,000, respectively). People who identified as engineers earn slightly lower salaries ($175,000). Associates, a relatively junior category, earn an average of $140,000 per year. Those who used “programmer” in their job title are a puzzle. There were only three of them, which is a surprise in itself, and all have salaries in the $50,000 to $100,000 range (average $86,000). Consultants also did somewhat poorly, with an average salary of $129,000.

Those who identified as engineers (19%) made up the largest group of respondents, followed by associates (18%). Directors and managers each comprised 15% of the respondents. That might be a bias in our survey, since it’s difficult to believe that 30% of cloud professionals have directorial or managerial roles. (That fits the observation that our survey results may skew toward older participants.) Architects were less common (7%). And relatively few respondents identified themselves with the terms “DevOps” (2%), “consultant” (2%), or “developer” (2%). The small number of people who identify with DevOps is another puzzle. It’s often been claimed that the cloud makes operations teams unnecessary; “NoOps” shows up in discussions from time to time. But we’ve never believed that. Cloud deployments still have a significant operational component. While the cloud may allow a smaller group to oversee a huge number of virtual machines, managing those machines has become more complex—particularly with cloud orchestration tools like Kubernetes.

Figure 8. Average salary by job title

We also tried to understand what respondents are doing at work by asking about job roles, decoupling responsibilities from titles (Figure 9). So in another question, we asked respondents to choose between marketing, sales, product, executive, programmer, and architect roles, with no write-in option. Executives earn the highest salaries ($237,000) but were a relatively small group (14%). Architects are paid $188,000 per year on average; they were 33% of respondents. And for this question, respondents didn’t hesitate to identify as programmers: this group was the largest (43%), with salaries somewhat lower than architects ($163,000). This is roughly in agreement with the data we got from job titles. (And we should have asked about operations staff. Next year, perhaps.)

The remaining three groups—marketing, sales, and product—are relatively small. Only five respondents identified their role as marketing (0.6%), but they were paid well ($187,000). 1.5% of the respondents identified as sales, with an average salary of $186,000. And 8% of the respondents identified themselves with product, with a somewhat lower average salary of $162,000.

Figure 9. Average salary by role

Working from Home

When we were planning this survey, we were very curious about where people worked. Many companies have moved to a fully remote work model (as O’Reilly has), and many more are taking a hybrid approach. But just how common is remote work? And what consequences does it have for the employees who work from home rather than in an office?

It turns out that remote work is surprisingly widespread (Figure 10). We found that only 6% of respondents answered no to the question “Do you work remotely?” More than half (63%) said that they work remotely all the time, and the remainder (31%) work remotely 1–4 days per week.

Working remotely is also associated with higher salaries: the average salary for people who work remotely 1–4 days a week is $188,000. It’s only slightly less ($184,000) for people who work remotely all the time. Salaries are sharply lower for people who never work remotely (average $131,000).

Figure 10. Salaries and remote work

Salary increases show roughly the same pattern (Figure 11). While salaries are slightly higher for respondents who occasionally work in the office, salary increases were higher for those who are completely remote: the average increase was $8,400 for those who are remote 100% of the time, while those who work from home 1–4 days per week only averaged a $7,800 salary increase. We suspect that given time, these two groups would balance out. Salary changes for those who never work remotely were sharply lower ($4,500).

Of all jobs in the computing industry, cloud computing is probably the most amenable to remote work. After all, you’re working with systems that are remote by definition. You’re not reliant on your own company’s data center. If the application crashes in the middle of the night, nobody will be rushing to the machine room to reboot the server. A laptop and a network connection are all you need.

Figure 11. Salary increases and remote work

We’re puzzled by the relatively low salaries and salary increases for those who never work remotely. While there were minor differences, as you’d expect, there were no “smoking guns”: no substantial differences in education or job titles or roles. Does this difference reflect old-school companies that don’t trust their staff to be productive at home? And do they pay correspondingly lower salaries? If so, they’d better be forewarned: it’s very easy for employees to change jobs in the current labor market.

As the pandemic wanes (if indeed it wanes—despite what people think, that’s not what the data shows), will companies stick with remote work or will they require employees to come back to the office? Some companies have already asked their employees to return. But we believe that the trend toward remote work will be hard, if not impossible, to reverse, especially in a job market where employers are competing for talent. Remote work certainly raises issues about onboarding new hires, training, group dynamics, and more. And it’s not without problems for the employees themselves: childcare, creating appropriate work spaces, etc. These challenges notwithstanding, it’s difficult to imagine people who have eliminated a lengthy commute from their lives going back to the office on a permanent basis.

Certifications and Training

Nearly half (48%) of our respondents participated in technical training or certification programs in the last year. 18% of all respondents obtained one or more certifications, suggesting that the remaining 30% participated in training or some other form of professional development that wasn’t tied to a certification program.

The most common reasons for participating in training were learning new technologies (42%) and improving existing skills (40%). (Percentages are relative to the total number of respondents, which was 778.) 21% wanted to work on more interesting projects. The other possible responses were chosen less frequently: 9% of respondents wanted to move into a leadership role, and 12% were required to take training. Job security was an issue for 4% of the respondents, a very small minority. That’s consistent with our observation that employees have the upper hand in the labor market and are more concerned with advancement than with protecting their status quo.

Survey participants obtained a very broad range of certifications. We asked specifically about 11 cloud certifications that we identified as being particularly important. Most were specific to one of the three major cloud vendors: Microsoft Azure, Amazon Web Services, and Google Cloud. However, the number of people who obtained any specific certification was relatively small. The most popular certifications were AWS Certified Cloud Practitioner and Solutions Architect (both 4% of the total number of respondents). However, 8% of respondents answered “other” and provided a write-in answer. That’s 60 respondents, and we got 55 different write-ins. Obviously, there was very little duplication. The only submissions with multiple responses were CKA (Certified Kubernetes Administrator) and CKAD (Certified Kubernetes Application Developer). The range of training in this “other” group was extremely broad, spanning various forms of Agile training, security, machine learning, and beyond. Respondents were pursuing many vendor-specific certifications, and even academic degrees. (It’s worth noting that our 2021 Data/AI Salary Survey report also concluded that earning a certification for one of the major cloud providers was a useful tool for career advancement.)
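Because the percentages in this section are rounded and relative to the 778 total respondents, converting between shares and head counts is only approximate. The sketch below shows the arithmetic; the helper names are ours, invented for illustration, and the only figures taken from the report are the total of 778, the 8% and 4% shares, and the 60 write-in respondents.

```python
TOTAL_RESPONDENTS = 778  # total stated in the report


def share_to_count(percent, total=TOTAL_RESPONDENTS):
    """Approximate head count behind a published (rounded) percentage."""
    return round(total * percent / 100)


def count_to_share(count, total=TOTAL_RESPONDENTS):
    """Exact percentage of the total that a head count represents."""
    return 100 * count / total


print(share_to_count(8))             # head count implied by a rounded 8%
print(f"{count_to_share(60):.1f}%")  # share represented by 60 respondents
print(share_to_count(4))             # head count implied by a rounded 4%
```

Working back from a rounded 8% gives roughly 62 people, while the 60 write-in respondents are about 7.7% of the total, which rounds to the same 8%. At this sample size, each percentage point is fewer than eight people, so small differences between categories should be read cautiously.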

Given the number of certifications that are available, this isn’t surprising. It’s somewhat more surprising that there isn’t any consensus on which certifications are most important. When we look at salaries, though, we see some signals…at least among the leading certifications. The largest salaries are associated with Google Cloud Certified Professional Cloud Architect ($231,000). People who earned this certification also received a substantial salary increase (7.1%). Those who obtained an AWS Certified Solutions Architect – Professional, AWS Certified Solutions Architect – Associate, or Microsoft Certified: Azure Solutions Architect Expert certification also earn very high salaries ($212,000, $201,000, and $202,000, respectively), although these three received smaller salary increases (4.6%, 4.4%, and 4.0%, respectively). Those who earned the CompTIA Cloud+ certification receive the lowest salary ($132,000) and got a relatively small salary increase (3.5%). The highest salary increase went to those who obtained the Google Cloud Certified Professional Cloud DevOps Engineer certification (9.7%), with salaries in the middle of the range ($175,000).

We can’t draw any conclusions about the salaries or salary increases corresponding to the many certifications listed among the “other” responses; most of those certifications only appeared once. But it seems clear that the largest salaries and salary increases go to those who are certified for one of the big three platforms: Google Cloud, AWS, and Microsoft Azure (Figures 12 and 13).

The salaries and salary increases for the two Google certifications are particularly impressive. Given that Google Cloud is the least widely used of the major platforms, and that the number of respondents for these certifications was relatively small, we suspect that talent proficient with Google’s tools and services is harder to find and drives the salaries up.

Figure 12. Average salary by certification

Figure 13. Average salary increase by certification

Our survey respondents engaged in many different types of training. The most popular were watching videos and webinars (41%), reading books (39%), and reading blogs and industry articles (34%). 30% of the respondents took classes online. Given the pandemic, it isn’t at all surprising that only 1.7% took classes in person. 23% attended conferences, either online or in person. (We suspect that the majority attended online.) And 24% participated in company-offered training.

There’s surprisingly little difference between the average salaries associated with each type of learning. That’s partly because respondents were allowed to choose more than one response. But it’s also notable that the average salaries for most types of learning are lower than the average salary for the respondents as a whole. The average salary by type of learning ranges from $167,000 (in-person classes) to $184,000 (company-provided educational programs). These salaries are on the low side compared to the overall average of $182,000. Lower salaries may indicate that training is most attractive to people who want to get ahead in their field. This fits the observation that most of the people who participated in training did so to obtain new skills or to improve current ones. After all, to many companies “the cloud” is still relatively new, and they need to retrain their current workforces.
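To see why allowing multiple responses tends to flatten the differences between groups, consider a toy example. Everything in it is invented for illustration (the salaries, the choices, and the function name `average_by_choice`); it is a sketch of the effect, not a reproduction of the survey analysis.

```python
def average_by_choice(responses):
    """Average salary per option when each respondent may pick several options."""
    totals = {}
    for salary, choices in responses:
        for choice in choices:
            running_sum, n = totals.get(choice, (0, 0))
            totals[choice] = (running_sum + salary, n + 1)
    return {choice: s / n for choice, (s, n) in totals.items()}


# Hypothetical respondents: (salary, set of learning types chosen).
responses = [
    (120_000, {"videos", "books", "blogs"}),
    (160_000, {"videos", "books"}),
    (200_000, {"books", "blogs"}),
    (240_000, {"videos", "blogs"}),
]

for choice, mean in sorted(average_by_choice(responses).items()):
    print(f"{choice}: ${mean:,.0f}")
```

The individual salaries in this toy data span $120,000 to $240,000, but every respondent is counted in two or three groups, so the per-group averages fall in a much narrower band. The same overlap is at work whenever survey groups are not mutually exclusive.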

When we look at the time that respondents spent in training (Figure 14), we see that the largest group spent 20–39 hours in the past year (13% of all the respondents). 12% spent 40–59 hours; and 10% spent over 100 hours. No respondents reported spending 10–19 hours in training. (There were also relatively few in the 80–99 hour group, but we suspect that’s an artifact of “bucketing”: if you’ve taken 83 hours of training, you’re likely to think, “I don’t know how much time I spent in training, but it was a lot,” and choose 100+.) The largest salary increases went to those who spent 40–59 hours in training, followed by those who spent over 100 hours; the smallest salary increases, and the lowest salaries, went to those who only spent 1–9 hours in training. Managers take training into account when planning compensation, and those who skimp on training shortchange themselves.

Figure 14. Percentage salary increase by time spent in training

The Cloud Providers

A survey of this type wouldn’t be complete without talking about the major cloud providers. There’s no really big news here (Figure 15). Amazon Web Services has the most users, at 72%, followed by Microsoft Azure (42%) and Google Cloud (31%). Relative to the cloud survey we did last year, Google Cloud and Azure appear to have dropped slightly compared to AWS. But the changes aren’t large. Oracle’s cloud offering was surprisingly strong at 6%, and 4% of the respondents use IBM Cloud.

When we look at the biggest cloud providers that aren’t based in the US, we find that they’re still a relatively small component of cloud usage: 0.6% of respondents use Alibaba, while 0.3% use Tencent. Because there are so few users among our respondents, the percentages don’t mean much: a few more users, and we might see something completely different. That said, we expected to see more users working with Alibaba; it’s possible that tensions between the United States and China have made it a less attractive option.

20% of the respondents reported using a private cloud. While it’s not entirely clear what the term “private cloud” means—for some, it just means a traditional data center—almost all the private cloud users also reported using one of the major cloud providers. This isn’t surprising; private clouds make the most sense as part of a hybrid or multicloud strategy, where the private cloud holds data that must be kept on premises for security or compliance reasons.

6% of the respondents reported using a cloud provider that we didn’t list. These answers were almost entirely from minor cloud providers, which had only one or two users among the survey participants. And surprisingly, 4% of the respondents reported that they weren’t using any cloud provider.

Figure 15. Cloud provider usage by percentage of respondents

There’s little difference between the salaries reported by people using the major providers (Figure 16). Tencent stands out; the average salary for its users is $275,000. But there were so few Tencent users among the survey respondents that we don’t believe this average is meaningful. There appears to be a slight salary premium for users of Oracle ($206,000) and Google ($199,000); since these cloud providers aren’t as widely used, it’s easy to assume that organizations committed to them are willing to pay slightly more for specialized talent, a phenomenon we’ve observed elsewhere. Almost as a footnote, we see that the respondents who don’t use a cloud have significantly lower salaries ($142,000).

Figure 16. Average salary by cloud provider

Cloud providers offer many services, but their basic services fall into a few well-defined classes (Figure 17). 75% of the survey respondents reported using virtual instances (for example, AWS EC2), and 74% use bucket storage (for example, AWS S3). These are services that are offered by every cloud provider. Most respondents use an SQL database (59%). Somewhat smaller numbers reported using a NoSQL database (41%), often in conjunction with an SQL database. 49% use container orchestration services; 45% use “serverless,” which suggests that serverless is more popular than we’ve seen in our other recent surveys.

Only 11% reported using some kind of AutoML—again, a service that’s provided by all the major cloud providers, though under differing names. And again, we saw no significant differences in salary based on what services were in use. That makes perfect sense; you wouldn’t pay a carpenter more for using a hammer than for using a saw.

Figure 17. Basic cloud services usage by percentage of respondents

The Work Environment

Salaries aside, what are cloud developers working with? What programming languages and tools are they using?


Python is the most widely used language (59% of respondents), followed by SQL (49%), JavaScript (45%), and Java (32%). It’s somewhat surprising that only a third of the respondents use Java, given that programming language surveys done by TIOBE and RedMonk almost always have Java, Python, and JavaScript in a near tie for first place. Java appears not to have adapted well to the cloud (Figure 18).

Salaries also follow a pattern that we’ve seen before. Although the top four languages are in high demand, they don’t command particularly high salaries: $187,000 for Python, $179,000 for SQL, $181,000 for JavaScript, and $188,000 for Java (Figure 19). These are all “table stakes” languages: they’re necessary and they’re what most programmers use on the job, but the programmers who use them don’t stand out. And despite the necessity, there’s a lot of talent available to fill these roles. As we saw in last year’s Data/AI Salary Survey report, expertise in Scala, Rust, or Go commands a higher salary ($211,000, $202,000, and $210,000, respectively). While the demand for these languages isn’t as high, there’s a lot less available expertise. Furthermore, fluency in any of these languages shows that a programmer has gone considerably beyond basic competence. They’ve done the work necessary to pick up additional skills.

Figure 18. Programming language usage by percentage of respondents

The lowest salaries were reported by respondents using PHP ($155,000). Salaries for C, C++, and C# are also surprisingly low ($170,000, $172,000, and $170,000, respectively); given the importance of C and C++ for software development in general and the importance of C# for the Microsoft world, we find it hard to understand why.

Almost all of the respondents use multiple languages. If we had to make a recommendation for someone who wanted to move into cloud development or operations, or for someone planning a cloud strategy from scratch, it would be simple: focus on SQL plus one of the other table stakes languages (Java, JavaScript, or Python). If you want to go further, pick one of the languages associated with the highest salaries. We think Scala is past its peak, but because of its strong connection to the Java ecosystem, Scala makes sense for Java programmers. For Pythonistas, we’d recommend choosing Go or Rust.

Figure 19. Average salary by programming language

Operating Systems

We asked our survey participants which operating systems they used so we could test something we’ve heard from several people who hire software developers: Linux is a must. That appears to be the case: 80% of respondents use Linux (Figure 20). Even though Linux really hasn’t succeeded in the desktop market (sorry), it’s clearly the operating system for most software that runs in the cloud. If Linux isn’t a requirement, it’s awfully close.

67% of the respondents reported using macOS, but we suspect that’s mostly as a desktop or laptop operating system. Of the major providers, only AWS offers macOS virtual instances, and they’re not widely used. (Apple’s license only allows macOS to run on Apple hardware, and only AWS provides Apple servers.) 57% of the respondents reported using some version of Windows. While we suspect that Windows is also used primarily as a desktop or laptop operating system, Windows virtual instances are available from all the major providers, including Oracle and IBM.

Figure 20. Operating system usage by percentage of respondents

Tools

We saw little variation in salary from tool to tool. This lack of variation makes sense. As we said above, we don’t expect a carpenter who uses a hammer to be paid more than a carpenter who uses a saw. To be a competent carpenter, you need to use both, along with levels, squares, and a host of other tools.

However, it is interesting to know what tools are commonly in use (Figure 21). There aren’t any real surprises. Docker is almost universal, used by 76% of the respondents. Kubernetes is also very widely used, by 61% of the respondents. Other components of the Kubernetes ecosystem didn’t fare as well: 27% of respondents reported using Helm, and 12% reported using Istio, which has been widely criticized for being too complex.

Alternatives to this core cluster of tools don’t appear to have much traction. 10% of the respondents reported using OpenShift, the IBM/Red Hat package that includes Kubernetes and other core components. Our respondents seem to prefer building their tooling environment themselves. Podman, an alternative to Docker and a component of OpenShift, is only used by 8% of the respondents. Unfortunately, we didn’t ask about Linkerd, which appears to be establishing itself as a service mesh that’s simpler to configure than Istio. However, it didn’t show up among the write-in responses, and the number of respondents who said “other” was relatively small (9%).

The HashiCorp tool set (Terraform, Consul, and Vault) appears to be more widely used: 41% of the respondents reported using Terraform, 17% use Vault, and 8% use Consul. However, don’t view these as alternatives to Kubernetes. Terraform is a tool for building and configuring cloud infrastructure, and Vault is a secure repository for secrets. Only Consul competes directly.

Figure 21. Tool usage by percentage of respondents

The Biggest Impact

Finally, we asked the respondents what would have the biggest impact on compensation and promotion. The least common answer was “data tools” (6%). This segment of our audience clearly isn’t working directly with data science or AI—though we’d argue that might change as more machine learning applications reach production. “Programming languages” was second from the bottom. The lack of concern about programming languages reflects reality. While we observed higher salaries for respondents who used Scala, Rust, or Go, if you’re solidly grounded in the basics (like Python and SQL), you’re in good shape. There’s limited value in pursuing additional languages once you have the table stakes.

The largest number of respondents said that knowledge of “cloud and containers” would have the largest effect on compensation. Again, containers are table stakes, as we saw in the previous section. Automation, security, and machine learning were also highly rated (18%, 15%, and 16%, respectively). It’s not clear why machine learning was ranked highly but data tools wasn’t. Perhaps our respondents interpreted “data tools” as software like Excel, R, and pandas.

11% of the respondents wrote in an answer. As usual with write-ins, the submissions were scattered, and mostly singletons. However, many of the write-in answers pointed toward leadership and management skills. Taken all together, the leadership and management responses add up to about 2% of the total respondents. Not a large number, but still a signal that some part of our audience is thinking seriously about IT leadership.

Confidence in the Future

“Cloud adoption is up and to the right”? No, we already told you we weren’t going to conclude that. Though it’s no doubt true; we don’t see cloud adoption slowing in the near future.

Salaries are high. That’s good for employees and difficult for employers. It’s common for staff to jump to another employer offering a higher salary and a generous signing bonus. The current stock market correction may put a damper on that trend. There are signs that Silicon Valley’s money supply is starting to dry up, in part because of higher interest rates but also because investors are nervous about how the online economy will respond to regulation, and impatient with startups whose business plan is to lose billions “buying” a market before they figure out how to make money. Higher interest rates and nervous investors could mean an end to skyrocketing salaries.

The gap between women’s and men’s salaries has narrowed, but it hasn’t closed. While we don’t have a direct comparison for the previous year, last year’s Data/AI Salary Survey report showed a 16% gap. In this survey, the gap has been cut to 7%, and women are receiving salary increases that are likely to close that gap even further. It’s anyone’s guess how this will play out in the future. Talent is in short supply, and that puts upward pressure on salaries. Next year, will we see women’s salaries on par with men’s? Or will the gap widen again when the talent shortage isn’t so acute?

While we aren’t surprised by the trend toward remote work, we are surprised at how widespread remote work has become: as we saw, only 6% of our survey respondents never work remotely, and almost two-thirds work remotely full time. Remote work may be easier for cloud professionals, because part of their job is inherently remote. However, after seeing these results, we’d predict similar numbers for other industry sectors. Remote work is here to stay.

Almost half of our survey respondents participated in some form of training in the past year. Training on the major cloud platforms (AWS, Azure, and Google Cloud) was associated with higher salaries. However, our participants also wrote in 55 “other” kinds of training and certifications, of which the most popular was CKA (Certified Kubernetes Administrator).

Let’s end by thinking a bit more about the most common answer to the question “What area do you feel will have the biggest impact on compensation and promotion in the next year?”: cloud and containers. Our first reaction is that this is a poorly phrased option; we should have just asked about containers. Perhaps that’s true, but there’s something deeper hidden in this answer. If you want to get ahead in cloud computing, learn more about the cloud. It’s tautological, but it also shows some real confidence in where the industry is heading. Cloud professionals may be looking for their next employer, but they aren’t looking to jump ship to the “next big thing.” Businesses aren’t jumping away from the cloud to “the next big thing” either; whether it’s AI, the “metaverse,” or something else, their next big thing will be built in the cloud. And containers are the building blocks of the cloud; they’re the foundation on which the future of cloud computing rests. Salaries are certainly “up and to the right,” and we don’t see demand for cloud-capable talent dropping any time in the near future.

Categories: Technology

“Sentience” is the Wrong Question

O'Reilly Radar - Tue, 2022/06/21 - 06:30

On June 6, Blake Lemoine, a Google engineer, was suspended by Google for disclosing a series of conversations he had with LaMDA, Google’s impressive large model, in violation of his NDA. Lemoine’s claim that LaMDA has achieved “sentience” was widely publicized, and was criticized by almost every AI expert. And it’s only two weeks after Nando de Freitas, tweeting about DeepMind’s new Gato model, claimed that artificial general intelligence is only a matter of scale. I’m with the experts; I think Lemoine was taken in by his own willingness to believe, and I believe de Freitas is wrong about general intelligence. But I also think that “sentience” and “general intelligence” aren’t the questions we ought to be discussing.

The latest generation of models is good enough to convince some people that they are intelligent, and whether or not those people are deluding themselves is beside the point. What we should be talking about is what responsibility the researchers building those models have to the general public. I recognize Google’s right to require employees to sign an NDA; but when a technology has implications as potentially far-reaching as general intelligence, are they right to keep it under wraps?  Or, looking at the question from the other direction, will developing that technology in public breed misconceptions and panic where none is warranted?

Google is one of the three major actors driving AI forward, in addition to OpenAI and Facebook. These three have demonstrated different attitudes towards openness. Google communicates largely through academic papers and press releases; we see gaudy announcements of its accomplishments, but the number of people who can actually experiment with its models is extremely small. OpenAI is much the same, though it has also made it possible to test-drive models like GPT-2 and GPT-3, in addition to building new products on top of its APIs–GitHub Copilot is just one example. Facebook has open sourced its largest model, OPT-175B, along with several smaller pre-built models and a voluminous set of notes describing how OPT-175B was trained.

I want to look at these different versions of “openness” through the lens of the scientific method. (And I’m aware that this research really is a matter of engineering, not science.)  Very generally speaking, we ask three things of any new scientific advance:

  • It can reproduce past results. It’s not clear what this criterion means in this context; we don’t want an AI to reproduce the poems of Keats, for example. We would want a newer model to perform at least as well as an older model.
  • It can predict future phenomena. I interpret this as being able to produce new texts that are (as a minimum) convincing and readable. It’s clear that many AI models can accomplish this.
  • It is reproducible. Someone else can do the same experiment and get the same result. Cold fusion fails this test badly. What about large language models?

Because of their scale, large language models have a significant problem with reproducibility. You can download the source code for Facebook’s OPT-175B, but you won’t be able to train it yourself on any hardware you have access to. It’s too large even for universities and other research institutions. You still have to take Facebook’s word that it does what it says it does. 

This isn’t just a problem for AI. One of our authors from the 90s went from grad school to a professorship at Harvard, where he researched large-scale distributed computing. A few years after getting tenure, he left Harvard to join Google Research. Shortly after arriving at Google, he blogged that he was “working on problems that are orders of magnitude larger and more interesting than I can work on at any university.” That raises an important question: what can academic research mean when it can’t scale to the size of industrial processes? Who will have the ability to replicate research results on that scale? This isn’t just a problem for computer science; many recent experiments in high-energy physics require energies that can only be reached at the Large Hadron Collider (LHC). Do we trust results if there’s only one laboratory in the world where they can be reproduced?

That’s exactly the problem we have with large language models. OPT-175B can’t be reproduced at Harvard or MIT. It probably can’t even be reproduced by Google and OpenAI, even though they have sufficient computing resources. I would bet that OPT-175B is too closely tied to Facebook’s infrastructure (including custom hardware) to be reproduced on Google’s infrastructure. I would bet the same is true of LaMDA, GPT-3, and other very large models, if you take them out of the environment in which they were built.  If Google released the source code to LaMDA, Facebook would have trouble running it on its infrastructure. The same is true for GPT-3. 

So: what can “reproducibility” mean in a world where the infrastructure needed to reproduce important experiments can’t be reproduced?  The answer is to provide free access to outside researchers and early adopters, so they can ask their own questions and see the wide range of results. Because these models can only run on the infrastructure where they’re built, this access will have to be via public APIs.

There are lots of impressive examples of text produced by large language models. LaMDA’s are the best I’ve seen. But we also know that, for the most part, these examples are heavily cherry-picked. And there are many examples of failures, which are certainly also cherry-picked.  I’d argue that, if we want to build safe, usable systems, paying attention to the failures (cherry-picked or not) is more important than applauding the successes. Whether it’s sentient or not, we care more about a self-driving car crashing than about it navigating the streets of San Francisco safely at rush hour. That’s not just our (sentient) propensity for drama;  if you’re involved in the accident, one crash can ruin your day. If a natural language model has been trained not to produce racist output (and that’s still very much a research topic), its failures are more important than its successes. 

With that in mind, OpenAI has done well by allowing others to use GPT-3–initially, through a limited free trial program, and now, as a commercial product that customers access through APIs. While we may be legitimately concerned by GPT-3’s ability to generate pitches for conspiracy theories (or just plain marketing), at least we know those risks. For all the useful output that GPT-3 creates (whether deceptive or not), we’ve also seen its errors. Nobody’s claiming that GPT-3 is sentient; we understand that its output is a function of its input, and that if you steer it in a certain direction, that’s the direction it takes. When GitHub Copilot (built from OpenAI Codex, which itself is built from GPT-3) was first released, I saw lots of speculation that it would cause programmers to lose their jobs. Now that we’ve seen Copilot, we understand that it’s a useful tool within its limitations, and discussions of job loss have dried up.

Google hasn’t offered that kind of visibility for LaMDA. It’s irrelevant whether they’re concerned about intellectual property, liability for misuse, or inflaming public fear of AI. Without public experimentation with LaMDA, our attitudes towards its output–whether fearful or ecstatic–are based at least as much on fantasy as on reality. Whether or not we put appropriate safeguards in place, research done in the open, and the ability to play with (and even build products from) systems like GPT-3, have made us aware of the consequences of “deep fakes.” Those are realistic fears and concerns. With LaMDA, we can’t have realistic fears and concerns. We can only have imaginary ones–which are inevitably worse. In an area where reproducibility and experimentation are limited, allowing outsiders to experiment may be the best we can do. 

Categories: Technology

Closer to AGI?

O'Reilly Radar - Tue, 2022/06/07 - 04:09

DeepMind’s new model, Gato, has sparked a debate on whether artificial general intelligence (AGI) is nearer–almost at hand–just a matter of scale.  Gato is a model that can solve multiple unrelated problems: it can play a large number of different games, label images, chat, operate a robot, and more.  Not so many years ago, one problem with AI was that AI systems were only good at one thing. After IBM’s Deep Blue defeated Garry Kasparov in chess,  it was easy to say “But the ability to play chess isn’t really what we mean by intelligence.” A model that plays chess can’t also play space wars. That’s obviously no longer true; we can now have models capable of doing many different things. 600 things, in fact, and future models will no doubt do more.

So, are we on the verge of artificial general intelligence, as Nando de Freitas (research director at DeepMind) claims? Is scale the only problem left? I don’t think so. It seems inappropriate to be talking about AGI when we don’t really have a good definition of “intelligence.” If we had AGI, how would we know it? We have a lot of vague notions about the Turing test, but in the final analysis, Turing wasn’t offering a definition of machine intelligence; he was probing the question of what human intelligence means.

Consciousness and intelligence seem to require some sort of agency. An AI can’t choose what it wants to learn, nor can it say “I don’t want to play Go, I’d rather play Chess.” Now that we have computers that can do both, can they “want” to play one game or the other? One reason we know our children (and, for that matter, our pets) are intelligent and not just automatons is that they’re capable of disobeying. A child can refuse to do homework; a dog can refuse to sit. And that refusal is as important to intelligence as the ability to solve differential equations, or to play chess. Indeed, the path towards artificial intelligence is as much about teaching us what intelligence isn’t (as Turing knew) as it is about building an AGI.

Even if we accept that Gato is a huge step on the path towards AGI, and that scaling is the only problem that’s left, it is more than a bit problematic to think that scaling is a problem that’s easily solved. We don’t know how much power it took to train Gato, but GPT-3 required about 1.3 gigawatt-hours: roughly 1/1000th the energy it takes to run the Large Hadron Collider for a year. Granted, Gato is much smaller than GPT-3, though it doesn’t work as well; Gato’s performance is generally inferior to that of single-function models. And granted, a lot can be done to optimize training (and DeepMind has done a lot of work on models that require less energy). But Gato has just over 600 capabilities, focusing on natural language processing, image classification, and game playing. These are only a few of many tasks an AGI will need to perform. How many tasks would a machine need to be able to perform to qualify as a “general intelligence”? Thousands? Millions? Can those tasks even be enumerated? At some point, the project of training an artificial general intelligence sounds like something from Douglas Adams’ novel The Hitchhiker’s Guide to the Galaxy, in which the Earth is a computer designed by an AI called Deep Thought to answer the question “What is the question to which 42 is the answer?”

Building bigger and bigger models in hope of somehow achieving general intelligence may be an interesting research project, but AI may already have achieved a level of performance that suggests specialized training on top of existing foundation models will reap far more short term benefits. A foundation model trained to recognize images can be trained further to be part of a self-driving car, or to create generative art. A foundation model like GPT-3 trained to understand and speak human language can be trained more deeply to write computer code.

Yann LeCun posted a Twitter thread about general intelligence (consolidated on Facebook) stating some “simple facts.” First, LeCun says that there is no such thing as “general intelligence.” LeCun also says that “human level AI” is a useful goal–acknowledging that human intelligence itself is something less than the type of general intelligence sought for AI. All humans are specialized to some extent. I’m human; I’m arguably intelligent; I can play Chess and Go, but not Xiangqi (often called Chinese Chess) or Golf. I could presumably learn to play other games, but I don’t have to learn them all. I can also play the piano, but not the violin. I can speak a few languages. Some humans can speak dozens, but none of them speak every language.

There’s an important point about expertise hidden in here: we expect our AGIs to be “experts” (to beat top-level Chess and Go players), but as a human, I’m only fair at chess and poor at Go. Does human intelligence require expertise? (Hint: re-read Turing’s original paper about the Imitation Game, and check the computer’s answers.) And if so, what kind of expertise? Humans are capable of broad but limited expertise in many areas, combined with deep expertise in a small number of areas. So this argument is really about terminology: could Gato be a step towards human-level intelligence (limited expertise for a large number of tasks), but not general intelligence?

LeCun agrees that we are missing some “fundamental concepts,” and we don’t yet know what those fundamental concepts are. In short, we can’t adequately define intelligence. More specifically, though, he mentions that “a few others believe that symbol-based manipulation is necessary.” That’s an allusion to the debate (sometimes on Twitter) between LeCun and Gary Marcus, who has argued many times that combining deep learning with symbolic reasoning is the only way for AI to progress. (In his response to the Gato announcement, Marcus labels this school of thought “Alt-intelligence.”) That’s an important point: impressive as models like GPT-3 and GLaM are, they make a lot of mistakes. Sometimes those are simple mistakes of fact, such as when GPT-3 wrote an article about the United Methodist Church that got a number of basic facts wrong. Sometimes, the mistakes reveal a horrifying (or hilarious, they’re often the same) lack of what we call “common sense.” Would you sell your children for refusing to do their homework? (To give GPT-3 credit, it points out that selling your children is illegal in most countries, and that there are better forms of discipline.)

It’s not clear, at least to me, that these problems can be solved by “scale.” How much more text would you need to know that humans don’t, normally, sell their children? I can imagine “selling children” showing up in sarcastic or frustrated remarks by parents, along with texts discussing slavery. I suspect there are few texts out there that actually state that selling your children is a bad idea. Likewise, how much more text would you need to know that Methodist general conferences take place every four years, not annually? The general conference in question generated some press coverage, but not a lot; it’s reasonable to assume that GPT-3 had most of the facts that were available. What additional data would a large language model need to avoid making these mistakes? Minutes from prior conferences, documents about Methodist rules and procedures, and a few other things. As modern datasets go, it’s probably not very large; a few gigabytes, at most. But then the question becomes “How many specialized datasets would we need to train a general intelligence so that it’s accurate on any conceivable topic?”  Is that answer a million?  A billion?  What are all the things we might want to know about? Even if any single dataset is relatively small, we’ll soon find ourselves building the successor to Douglas Adams’ Deep Thought.

Scale isn’t going to help. But in that problem is, I think, a solution. If I were to build an artificial therapist bot, would I want a general language model? Or would I want a language model that has some broad knowledge, but has received some special training to give it deep expertise in psychotherapy? Similarly, if I want a system that writes news articles about religious institutions, do I want a fully general intelligence? Or would it be preferable to train a general model with data specific to religious institutions? The latter seems preferable–and it’s certainly more similar to real-world human intelligence, which is broad, but with areas of deep specialization. Building such an intelligence is a problem we’re already on the road to solving, by using large “foundation models” with additional training to customize them for special purposes. GitHub’s Copilot is one such model; O’Reilly Answers is another.

If a “general AI” is no more than “a model that can do lots of different things,” do we really need it, or is it just an academic curiosity?  What’s clear is that we need better models for specific tasks. If the way forward is to build specialized models on top of foundation models, and if this process generalizes from language models like GPT-3 and O’Reilly Answers to other models for different kinds of tasks, then we have a different set of questions to answer. First, rather than trying to build a general intelligence by making an even bigger model, we should ask whether we can build a good foundation model that’s smaller, cheaper, and more easily distributed, perhaps as open source. Google has done some excellent work at reducing power consumption, though it remains huge, and Facebook has released their OPT model with an open source license. Does a foundation model actually require anything more than the ability to parse and create sentences that are grammatically correct and stylistically reasonable?  Second, we need to know how to specialize these models effectively.  We can obviously do that now, but I suspect that training these subsidiary models can be optimized. These specialized models might also incorporate symbolic manipulation, as Marcus suggests; for two of our examples, psychotherapy and religious institutions, symbolic manipulation would probably be essential. If we’re going to build an AI-driven therapy bot, I’d rather have a bot that can do that one thing well than a bot that makes mistakes that are much subtler than telling patients to commit suicide. I’d rather have a bot that can collaborate intelligently with humans than one that needs to be watched constantly to ensure that it doesn’t make any egregious mistakes.

We need the ability to combine models that perform different tasks, and we need the ability to interrogate those models about the results. For example, I can see the value of a chess model that included (or was integrated with) a language model that would enable it to answer questions like “What is the significance of Black’s 13th move in the 4th game of Fischer vs. Spassky?” Or “You’ve suggested Qc5, but what are the alternatives, and why didn’t you choose them?” Answering those questions doesn’t require a model with 600 different abilities. It requires two abilities: chess and language. Moreover, it requires the ability to explain why the AI rejected certain alternatives in its decision-making process. As far as I know, little has been done on this latter question, though the ability to expose other alternatives could be important in applications like medical diagnosis. “What solutions did you reject, and why did you reject them?” seems like important information we should be able to get from an AI, whether or not it’s “general.”

An AI that can answer those questions seems more relevant than an AI that can simply do a lot of different things.

Optimizing the specialization process is crucial because we’ve turned a technology question into an economic question. How many specialized models, like Copilot or O’Reilly Answers, can the world support? We’re no longer talking about a massive AGI that takes terawatt-hours to train, but about specialized training for a huge number of smaller models. A psychotherapy bot might be able to pay for itself–even though it would need the ability to retrain itself on current events, for example, to deal with patients who are anxious about, say, the invasion of Ukraine. (There is ongoing research on models that can incorporate new information as needed.) It’s not clear that a specialized bot for producing news articles about religious institutions would be economically viable. That’s the third question we need to answer about the future of AI: what kinds of economic models will work? Since AI models are essentially cobbling together answers from other sources that have their own licenses and business models, how will our future agents compensate the sources from which their content is derived? How should these models deal with issues like attribution and license compliance?

Finally, projects like Gato don’t help us understand how AI systems should collaborate with humans. Rather than just building bigger models, researchers and entrepreneurs need to be exploring different kinds of interaction between humans and AI. That question is out of scope for Gato, but it is something we need to address regardless of whether the future of artificial intelligence is general or narrow but deep. Most of our current AI systems are oracles: you give them a prompt, they produce an output.  Correct or incorrect, you get what you get, take it or leave it. Oracle interactions don’t take advantage of human expertise, and risk wasting human time on “obvious” answers, where the human says “I already know that; I don’t need an AI to tell me.”

There are some exceptions to the oracle model. Copilot places its suggestion in your code editor, and changes you make can be fed back into the engine to improve future suggestions. Midjourney, a platform for AI-generated art that is currently in closed beta, also incorporates a feedback loop.

In the next few years, we will inevitably rely more and more on machine learning and artificial intelligence. If that interaction is going to be productive, we will need a lot from AI. We will need interactions between humans and machines, a better understanding of how to train specialized models, the ability to distinguish between correlations and facts–and that’s only a start. Products like Copilot and O’Reilly Answers give a glimpse of what’s possible, but they’re only the first steps. AI has made dramatic progress in the last decade, but we won’t get the products we want and need merely by scaling. We need to learn to think differently.

Categories: Technology

Radar Trends to Watch: June 2022

O'Reilly Radar - Wed, 2022/06/01 - 04:54

The explosion of large models continues. Several developments are especially noteworthy. DeepMind’s Gato model is unique in that it’s a single model that’s trained for over 600 different tasks; whether or not it’s a step towards general intelligence (the ensuing debate may be more important than the model itself), it’s an impressive achievement. Google Brain’s Imagen creates photorealistic images that are impressive, even after you’ve seen what DALL-E 2 can do. And Allen AI’s Macaw (surely an allusion to Emily Bender and Timnit Gebru’s Stochastic Parrots paper) is open source, one tenth the size of GPT-3, and claims to be more accurate. Facebook/Meta is also releasing an open source large language model, including the model’s training log, which records in detail the work required to train it.

Artificial Intelligence
  • Is thinking of autonomous vehicles as AI systems rather than as robots the next step forward? A new wave of startups is trying techniques such as reinforcement learning to train AVs to drive safely.
  • Generative Flow Networks may be the next major step in building better AI systems.
  • The ethics of building AI bots that mimic real dead people seems like an academic question, until someone does it: using GPT-3, a developer created a bot based on his deceased fiancée. OpenAI objected, stating that building such a bot was a violation of its terms of service.
  • Cortical Labs and other startups are building computers that incorporate human neurons. It’s claimed that these systems can be trained to perform game-playing tasks significantly faster than traditional AI.
  • Google Brain has built a new text-to-image generator called Imagen that creates photorealistic images. Although images generated by projects like this are always cherry-picked, the image quality is impressive; the developers claim that it is better than DALL-E 2.
  • DeepMind has created a new “generalist” model called Gato. It is a single model that can solve many different kinds of tasks: playing multiple games, labeling images, and so on. It has prompted a debate on whether Artificial General Intelligence is simply a matter of scale.
  • AI in autonomous vehicles can be used to eliminate waiting at traffic lights, increase travel speed, and reduce fuel consumption and carbon emissions. Surprisingly, if only 25% of the vehicles are autonomous, you get 50% of the benefit.
  • Macaw is a language model developed by Allen AI (AI2). It is freely available and open-source. Macaw is 1/10th the size of GPT-3 and roughly 10% more accurate at answering questions, though (like GPT-3) it tends to fail at questions that require common sense or involve logical tricks.
  • Ai-Da is an AI-driven robot that can paint portraits–but is it art? Art is as much about human perception as it is about creation. What social cues prompt us to think that a robot is being creative?
  • Facebook/Meta has created a large language model called OPT that is similar in size and performance to GPT-3. Using the model is free for non-commercial work; the code is being released open source, along with documents describing how the model was trained.
  • Alice is a modular and extensible open source virtual assistant (think Alexa) that can run completely offline. It is private by default, though it can be configured to use Amazon or Google as backups. Alice can identify different users (for whom it can develop “likes” or “dislikes,” based on interactions).
Programming
  • High volume event streaming without a message queue: Palo Alto Networks has built a system for processing terabytes of security events per day without using a message queue, just a NoSQL database.
  • New tools allow workflow management across groups of spreadsheets. Spreadsheets are the original “low code”; these tools seem to offer spreadsheet users many of the features that software developers get from tools like git.
  • Portainer is a container management tool that lets you mount Docker containers as persistent filesystems.
  • NVIDIA has open-sourced its Linux device drivers. The code is available on GitHub. This is a significant change for a company that historically has avoided open source.
  • A startup named Buoyant is building tools to automate management of Linkerd. Linkerd, in turn, is a service mesh that is easier to manage than Istio and more appropriate for small to medium businesses.
  • Are we entering the “third age of JavaScript”? An intriguing article suggests that we are. In this view of the future, static site generation disappears, incremental rendering and edge routing become more important, and Next.js becomes a dominant platform.
  • Rowy is a low-code programming environment that intends to escape the limitations of Airtable and other low-code collaboration services. The interface is like a spreadsheet, but it’s built on top of the Google Cloud Firestore document database.
  • PyScript is a framework for running Python in the browser, mixed with HTML (in some ways, not unlike PHP). It is based on Pyodide (a WASM implementation of Python), integrates well with JavaScript, and might support other languages in the future.
Security
  • Machine learning raises the possibility of undetectable backdoor attacks, malicious attacks that can affect the output of a model but don’t measurably affect its performance. Security issues for machine learning aren’t well understood, and aren’t getting a lot of attention.
  • In a new supply chain attack, two widely used libraries (Python’s ctx and PHP’s PHPass) have been compromised to steal AWS credentials. The attacker now claims that these exploits were “ethical research,” possibly with the goal of winning bounties for reporting exploits.
  • While it is not yet accurate enough to work in practice, a new method for detecting cyber attacks can detect and stop attacks in under one second.
  • The Eternity Project is a new malware-as-a-service organization that offers many different kinds of tools for data theft, ransomware, and many other exploits. It’s possible that the project is itself a scam, but it appears to be genuine.
  • Palo Alto Networks has published a study showing that most cloud identity and access management policies are too permissive, and that 90% of the permissions granted are never used. Overly-permissive policies are a major vulnerability for cloud users.
  • NIST has just published a massive guide to supply chain security. For organizations that can’t digest this 326-page document, NIST plans to publish a quick-start guide.
  • The Passkey standard, supported by Google, Apple, and Microsoft, replaces passwords with other forms of authentication. An application makes an authentication request to the device, which can then respond using any authentication method it supports. Passkey is operating system-independent, and supports Bluetooth in addition to Internet protocols.
  • Google and Mandiant both report significant year-over-year increases in the number of 0-day vulnerabilities discovered in 2021.
  • Interesting statistics about ransomware attacks: The ransom is usually only 15% of the total cost of the attack; and on average, the ransom is 2.8% of net revenue (with discounts of up to 25% for prompt payment).
  • Bugs in the most widely used ransomware software, including REvil and Conti, can be used to prevent the attacker from encrypting your data.
Web and Web3 VR/AR/Metaverse
  • Niantic is building VPS (Visual Positioning System), an augmented reality map of the world, as part of its Lightship platform. VPS allows games and other AR products to be grounded to the physical world.
  • LivingCities is building a digital twin of the real world as a platform for experiencing the world in extended reality. That experience includes history, a place’s textures and feelings, and, of course, a new kind of social media.
  • New research in haptics allows the creation of realistic virtual textures by measuring how people feel things. Humans are extremely sensitive to the textures of materials, so creating good textures is important for everything from video games to telesurgery.
  • Google is upgrading its search engine for augmented reality: they are integrating images more fully into searches, creating multi-modal searches that incorporate images, text, and audio, and generating search results that can be explored through AR.
  • BabylonJS is an open source 3D engine, based on WebGL and WebGPU, that Microsoft developed. It is a strong hint that Microsoft’s version of the Metaverse will be web-based. It will support WebXR.
  • The fediverse is an ensemble of microblogging social media sites (such as Mastodon) that communicate with each other. Will they become a viable alternative to Elon Musk’s Twitter?
  • Varjo is building a “reality cloud”: a 3D mixed reality streaming service that allows photorealistic “virtual teleportation.” It’s not about weird avatars in a fake 3D world; they record your actions in your actual environment.
Hardware Design
  • Ethical design starts with a redefinition of success: well-being, equity, and sustainability, with good metrics for measuring your progress.
Quantum Computing
  • QICK is a new standardized control plane for quantum devices. The design of the control plane, including software, is all open source. A large part of the cost of building a quantum device is building the electronics to control it. QICK will greatly reduce the cost of quantum experimentation.
  • Researchers have built logical gates using error-corrected quantum bits. This is a significant step towards building a useful quantum computer.
Categories: Technology