Four short links: 17 August 2018

O'Reilly Radar - Fri, 2018/08/17 - 03:55

LED Patterns, System Change, Evented I/O, and Programmer Workflow

  1. Pixelblaze -- an advanced LED pattern-development engine and controller. It makes it fast and fun to write new patterns with its web-based live editor and highly optimized expression engine.
  2. Places to Intervene in a System -- (in increasing order of effectiveness) 9. Constants, parameters, numbers (subsidies, taxes, standards). 8. Regulating negative feedback loops. 7. Driving positive feedback loops. 6. Material flows and nodes of material intersection. 5. Information flows. 4. The rules of the system (incentives, punishments, constraints). 3. The distribution of power over the rules of the system. 2. The goals of the system. 1. The mindset or paradigm out of which the system—its goals, power structure, rules, its culture—arises.
  3. libuv Book -- a small set of tutorials about using libuv as a high-performance evented I/O library that offers the same API on Windows and Unix.
  4. LEO Editor -- a PIM, IDE, and outliner that accelerates the workflow of programmers, authors, and web designers. Outline nodes may appear in more than one place, allowing multiple organizations of data within a single outline.

Continue reading Four short links: 17 August 2018.

Categories: Technology

Simplifying machine learning lifecycle management

O'Reilly Radar - Thu, 2018/08/16 - 04:25

The O’Reilly Data Show Podcast: Harish Doddi on accelerating the path from prototype to production.

In this episode of the Data Show, I spoke with Harish Doddi, co-founder and CEO of Datatron, a startup focused on helping companies deploy and manage machine learning models. As companies move from machine learning prototypes to products and services, tools and best practices for productionizing and managing models are just starting to emerge. Today’s data science and data engineering teams work with a variety of machine learning libraries, data ingestion, and data storage technologies. Risk and compliance considerations mean that the ability to reproduce machine learning workflows is essential to meet audits in certain application domains. And as data science and data engineering teams continue to expand, tools need to enable and facilitate collaboration.

As someone who specializes in helping teams turn machine learning prototypes into production-ready services, I wanted to hear what Doddi has learned while working with organizations that aspire to “become machine learning companies.”

Continue reading Simplifying machine learning lifecycle management.

It's time to establish big data standards

O'Reilly Radar - Thu, 2018/08/16 - 04:00

The deployment of big data tools is being held back by the lack of standards in a number of growth areas.

Technologies for streaming, storing, and querying big data have matured to the point where the computer industry can usefully establish standards. As in other areas of engineering, standardization allows practitioners to port their learnings across a multitude of solutions, and to more easily employ different technologies together; standardization also allows solution providers to take advantage of sub-components to expeditiously build more compelling solutions with broader applicability.

Unfortunately, little has been done to standardize big data technologies so far. There are all sorts of solutions but few standards to address the challenges just mentioned. Areas of growth that would benefit from standards are:

  • Stream processing
  • Storage engine interfaces
  • Querying
  • Benchmarks
  • Security and governance
  • Metadata management
  • Deployment (including cloud / as a service options)
  • Integration with other fast-growing technologies, such as AI and blockchain

The following sections will look at each area.

Streaming

Big data came about with the influx of the high volumes and velocity of streaming data. Several products offer solutions to process streaming data, both proprietary and open source: Amazon Web Services, Azure, and innumerable tools contributed to the Apache Foundation, including Kafka, Pulsar, Storm, Spark, and Samza. But each has its own interface. Unlike SQL, there is no standard API or interface to handle this data, although Apache is now promoting a meta-interface called Beam. This makes it hard for solution providers to integrate with these rapidly evolving solutions.

Nor is there an easy way for Internet of Things (IoT) application developers to use these technologies interchangeably and retain portability, so they don't get tied down by proprietary interfaces—essentially the same guiding principles that drove the ANSI SQL standards.
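The portability that the ANSI SQL standards delivered came from a single interface that many engines implement. A toy Python sketch of the same idea for streaming; every name here is invented for illustration and is not any real or proposed API:

```python
from abc import ABC, abstractmethod

class StreamEngine(ABC):
    """Hypothetical engine-neutral streaming interface (not a real standard)."""
    @abstractmethod
    def source(self, records): ...
    @abstractmethod
    def map(self, stream, fn): ...
    @abstractmethod
    def collect(self, stream): ...

class InMemoryEngine(StreamEngine):
    """A stand-in for what a Kafka, Pulsar, or Spark backend would provide."""
    def source(self, records):
        return iter(records)
    def map(self, stream, fn):
        return (fn(r) for r in stream)
    def collect(self, stream):
        return list(stream)

def pipeline(engine: StreamEngine, readings):
    """Application code written once against the interface, portable across engines."""
    s = engine.source(readings)
    s = engine.map(s, lambda temp_c: temp_c * 9 / 5 + 32)  # convert C to F
    return engine.collect(s)

print(pipeline(InMemoryEngine(), [0, 100]))  # [32.0, 212.0]
```

The application's pipeline code never names a vendor; swapping engines means swapping only the `StreamEngine` implementation.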

Storage engine interfaces

With the proliferation of a large number of NoSQL storage engines (CouchDB, Cassandra, HBase, MongoDB, etc.) we again face a plethora of incompatible APIs. In addition, new types of applications call for a radical rethinking of how to process data. Such rethinking includes document stores (with JSON becoming the prevalent data interchange format), and graph databases: Gremlin, SPARQL (which is a W3C standard), and Cypher as interfaces to Neo4J, JanusGraph, and other databases. GIS systems provide a very different model for interacting with their complex form of data. Apache Lucene, and related search engines, are also unique in the extensive capabilities they provide.

Applications cannot swap storage engines if needed. Also, SQL query engines such as Apache Trafodion, EsgynDB, Apache Spark, Apache Hive, and Apache Impala must interact independently with each storage engine.

Just as ODBC and JDBC facilitated the development of many BI and ETL tools that work with any database engine, a standard interface could facilitate access to data from any of these storage engines. Furthermore, it would substantially expand the ecosystem of solutions that could be used with the storage engine.
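In the JDBC world, this works through a driver registry: applications pass a URL to a driver manager, and whichever registered driver claims the URL answers. A hypothetical Python sketch of the same pattern for NoSQL storage engines (the registry, URL scheme, and `DictStore` class are all invented for illustration):

```python
# Hypothetical driver registry in the spirit of JDBC's DriverManager.
_drivers = {}

def register_driver(scheme, factory):
    """Associate a URL scheme with a storage engine driver."""
    _drivers[scheme] = factory

def connect(url):
    """Dispatch to the driver registered for the URL's scheme."""
    scheme = url.split("://", 1)[0]
    return _drivers[scheme](url)

class DictStore:
    """In-memory stand-in for an HBase, Cassandra, or MongoDB driver."""
    def __init__(self, url):
        self.url, self.data = url, {}
    def put(self, key, value):
        self.data[key] = value
    def get(self, key):
        return self.data[key]

register_driver("dictstore", DictStore)

db = connect("dictstore://localhost/test")
db.put("k", 42)
print(db.get("k"))  # 42
```

A real standard would, of course, also have to pin down the data model behind `put` and `get`, which is exactly where NoSQL engines differ most.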

Finally, even though parallelism is important for data flow between the query engine and the storage engine, no standard interface facilitates it. Partitioning can vary during such flows.

Querying

Data models supported by NoSQL databases differ just as much as their interfaces. The main standard with some applicability to big data is ANSI SQL. Although it was explicitly rejected in the first decade of the 2000s by many of the NoSQL databases, it has now been adopted by many of them as an alternative API because of its prevalence, its familiarity amongst developers, and the ecosystem supporting it. SQL is still evolving and is doing a credible job in handling the big data challenges. For instance, JSON support and Table Valued predicates were added in the 2016 standard.

But even SQL has not kept pace with the changes in the big data space, given that standards take a lot of collaboration, deliberation, and effort to get right and set. Two other familiar standards in the relational database world—ODBC and JDBC—have not changed much for quite some time, despite big data's need to handle parallelism for large volumes of data, the variety of data structures and models, and the changed paradigm of streaming data velocity.

The SQL standard needs to evolve to support:

  • Streaming data
  • Publish/subscribe interfaces
  • Windowing on streams:
    • Time, count, and content triggers
    • Tumbling, sliding, and session windows
    • Event versus time processing
  • Rules for joining streams of data
  • Interfaces to sophisticated search solutions
  • Interfaces to GIS systems
  • Interfaces to graph databases, so that users can submit standard queries against graph databases and map the results to a tabular format, similar to the elegant JSON-to-relational mapping in the ANSI SQL 2016 standard for JSON
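As one example from the wish list above, a tumbling window partitions a stream into fixed-size, non-overlapping buckets. A minimal Python sketch, assuming events arrive as (timestamp, value) pairs:

```python
from collections import defaultdict

def tumbling_window(events, width):
    """Group (timestamp, value) events into non-overlapping windows of
    `width` time units and sum the values in each window."""
    windows = defaultdict(float)
    for ts, value in events:
        # Every timestamp maps to exactly one window start.
        windows[ts // width * width] += value
    return dict(windows)

events = [(1, 10.0), (4, 5.0), (6, 2.0), (11, 1.0)]
print(tumbling_window(events, 5))  # {0: 15.0, 5: 2.0, 10: 1.0}
```

A sliding window would differ only in that one event contributes to several overlapping buckets; it is exactly these definitions that a standard would need to fix.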

Benchmarks

The workloads for big data span the gamut from streaming to operational to reporting and ad hoc querying to analytical; many of these have real-time, near real-time, batch, and interactive aspects. Currently, no benchmark assesses the price/performance of these hybrid operational and analytical processing workloads. Many vendors claim to support these varied workloads, but definitions are lacking, and no benchmarks exist to test them.

To evaluate potential big data products, most customers turn to benchmarks created by the Transaction Processing Performance Council (TPC). The TPC-DS standard, intended to benchmark BI and analytical workloads, offered considerable promise. But this promise was subverted in two ways. First, vendors present customers with altered versions of the benchmark, distorted to favor the vendor's product. Many of the TPC-DS results shared by vendors do not account for common usage, including queries and other workloads running at different levels of concurrency and at the scale of big data, as outlined in the specification. Second, unlike most TPC standards, the TPC-DS standard was never bolstered by audited results that enable customers to assess relative price/performance.

Security and governance

There are various security and governance infrastructures for big data deployments. For instance, the Hadoop environment has Apache Ranger and Apache Sentry, and each cloud provider has its own security information and event management system. It is difficult for applications and solution providers to integrate with these environments because, again, each implementation has a different API.

Standardizing the API for security administration and monitoring would be very beneficial, allowing enterprises to use standard mechanisms to configure and enforce security. These standards would enable more solutions to use these security systems. Consider the integration of crucial security events across various enterprise data sources. If the security products used a standard API, it would be a lot easier for them to interface with any data source, providing the client more choice and flexibility. The same holds true when deploying role and user privileges across enterprise data sources.

More conveniently, when data is moved to another data storage system, or accessed by another sub-system, the access rights to that data could move automatically and without sysadmin effort, so that the same people would still have access to the same fields and rows, regardless of the tools they use to access that data.

Metadata management

With the proliferation of data across multiple storage engines, the metadata for that data needs a central repository. Regardless of whether a table is in HBase, ORC, Parquet, or Trafodion, if it were registered in one set of metadata tables, it would be much easier for client tools to access this data than the current situation, where clients have to connect to different metadata tables.

The proper goal here is to standardize the information schema for these client tools and centralize all security administration. That would present a federated view of all the database objects.

Extending this metadata with business information for the objects would facilitate governance and data lineage, instead of having to provide these services again across different metadata repositories. This would make metadata or master data management easier.

Deployment

Each cloud provider requires its own way to provision, configure, monitor, and manage a database and the resources it needs. This means that any client who wants to change cloud providers, or make databases on different providers work together, must change all their procedures and scripts, and perhaps much more.

A standard set of APIs would make this task easier, regardless of whether the customer was deploying the database on a public or private cloud, or as a hybrid deployment. Standards have been proposed but have not succeeded in the market. For instance, OpenStack has open source, community-driven interfaces for many of these tasks, but these have gained no adoption among the services chosen by most customers (Amazon.com, Google, Microsoft's Azure). VMware defined a vSphere standard some years ago, but it is almost completely ignored.

When cloud providers offer comparable services, these should be standard as well. For instance, an application that needs object storage such as AWS S3 or Azure Blob, should be able to get access through a standard interface.

Integration with other emerging technologies

It is also important to think about how to standardize the integration between databases and other emerging technologies: machine learning tools such as TensorFlow, R, and Python libraries; the analysis of unstructured data, including sentiment analysis, image processing, and natural language processing (NLP); and blockchain. Today, each solution in these areas has a unique interface, so integrating it with the database is always a custom effort.

Whereas user-defined functions and table user-defined functions have good standards, there is no way for one database to call user-defined functions, or even stored procedures, written for some other database. It would be so much more effective if users could use a large library of UDFs developed for any database as a plug-and-play technology. Such a library would also provide the developers of these functions a large customer base where their functions could be deployed.
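SQLite already hints at what plug-and-play UDFs could look like: Python's built-in sqlite3 module can register an ordinary Python function as a SQL UDF. If a cross-database standard existed, the same function could be dropped into any engine unchanged:

```python
import sqlite3

def celsius_to_fahrenheit(c):
    """An ordinary Python function, usable as a SQL UDF."""
    return c * 9 / 5 + 32

conn = sqlite3.connect(":memory:")
# Register the function under the SQL name c_to_f, taking 1 argument.
conn.create_function("c_to_f", 1, celsius_to_fahrenheit)

result = conn.execute("SELECT c_to_f(100)").fetchone()[0]
print(result)  # 212.0
```

Each engine today has its own registration mechanism and calling convention; a standard would let a library of such functions target all of them at once.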

Conclusion

The deployment of big data tools is now being held back by the lack of standards in the areas I have listed. The development of these standards is in no way intended to thwart innovation or keep providers from providing unique solutions. On the contrary—look at the ANSI SQL standard: it has facilitated the dramatic growth of a large number of database solutions.

An important aspect of standards is ensuring compliance. The National Institute of Standards and Technology (NIST) in the U.S. Department of Commerce did that for SQL. Although SQL has admirably broad adoption as a standard, this does not guarantee smooth interoperability. First of all, SQL has evolved, so vendors can pick and choose which version of the standard to adhere to. Even there, how much of that version they adhere to is not clear without a certification. Second, numerous non-standard enhancements are offered.

Some of the areas identified may provide more value to users and providers than others. A prioritization of which standards would provide the most value, based on input from the user and vendor community, could help guide efforts to develop them.

These efforts could be facilitated via new standards bodies or with the cooperation and under the tutelage of existing standards bodies such as ANSI, ISO, TPC, and W3C. The committee members of these standards organizations have tremendous experience in developing excellent standards. They can skillfully navigate the bumpy road to achieve consensus across participants who otherwise compete. But it is up to the end users and providers to apply the pressure. Do we think we can start a movement to do so?

Continue reading It's time to establish big data standards.

Four short links: 16 August 2018

O'Reilly Radar - Thu, 2018/08/16 - 03:55

Distributed Execution, Roaming SIM, Social Robot, and Bad Design

  1. Ray -- a flexible, high-performance distributed execution framework from Berkeley's RISE Lab, targeting AI applications including reinforcement learning. (via "Notes from the first Ray meetup")
  2. KnowRoaming Global SIM Sticker -- a sticker you apply to your SIM card: at home, your SIM behaves as usual; when you travel, the sticker switches you to local roaming rates. (via Engadget)
  3. Haru (IEEE Spectrum) -- inside Honda's new social robot.
  4. Botched CIA Communications System Helped Blow Agents' Cover (Foreign Policy) -- In the words of one of the former officials, the CIA had “fucked up the firewall” between the two systems. When bad systems architecture kills people...

Continue reading Four short links: 16 August 2018.

Site reliability engineering (SRE): A simple overview

O'Reilly Radar - Thu, 2018/08/16 - 03:00

Get a basic understanding of site reliability engineering (SRE) and then go deeper with recommended resources.

Curious about site reliability engineering (SRE)?

The following overview is for you. It covers some of the basics of SRE: what it is, how it’s used, and what you need to keep in mind before adopting SRE methods.

Continue reading Site reliability engineering (SRE): A simple overview.

Best practices to achieve balance between design and performance

O'Reilly Radar - Wed, 2018/08/15 - 04:00

Ways to bring designers and developers together to optimize user experience.

Continue reading Best practices to achieve balance between design and performance.

Notes from the first Ray meetup

O'Reilly Radar - Wed, 2018/08/15 - 04:00

Ray is beginning to be used to power large-scale, real-time AI applications.

Machine learning adoption is accelerating due to the growing number of large labeled data sets, languages aimed at data scientists (R, Julia, Python), frameworks (scikit-learn, PyTorch, TensorFlow, etc.), and tools for building infrastructure to support end-to-end applications. While some interesting applications of unsupervised learning are beginning to emerge, many current machine learning applications rely on supervised learning. In a recent series of posts, Ben Recht makes the case for why some of the most interesting problems might actually fall under reinforcement learning (RL), specifically systems that are able to act based upon past data and do so in a way that is safe, robust, and reliable.

But first we need RL tools that are accessible for practitioners. Unlike supervised learning, in the past there hasn’t been a good open source tool for easily trying RL at scale. I think things are about to change. I was fortunate enough to receive an invite to the first meetup devoted to Ray, RISE Lab’s high-performance, distributed execution engine, which targets emerging AI applications, including those that rely on reinforcement learning. This was a small, invite-only affair held at OpenAI, and most of the attendees were interested in reinforcement learning.

Here’s a brief rundown of the program:

  • Robert Nishihara and Philipp Moritz gave a brief overview and update on the Ray project, including a description of items on the near-term roadmap.
  • Eric Liang and Richard Liaw gave a quick tour of two libraries built on top of Ray: RLlib (scalable reinforcement learning) and Tune (a hyperparameter optimization framework). They also pointed to a recent ICML paper on RLlib. Both of these libraries are easily accessible to anyone familiar with Python, and both should prove popular among industrial data scientists.
Figure 1. RLlib and reinforcement learning. Image courtesy of RISE Lab.
  • Eugene Vinitsky showed some amazing videos of how Ray is helping them understand and predict traffic patterns in real time, and in the process help researchers study large transportation networks. The videos were some of the best examples of the combination of IoT, sensor networks, and reinforcement learning that I’ve seen.
  • Alex Bao of Ant Financial described three applications they’ve identified for Ray. I’m not sure I’m allowed to describe them here, but they were all very interesting and important use cases. The most important takeaway for the evening was Ant Financial is already using Ray in production in two of the three use cases (and they are close to deploying Ray to production for the third)! Given that Ant Financial is the largest unicorn company in the world, this is amazing validation for Ray.

With the buzz generated by the evening’s presentations and early examples of production deployments beginning to happen, I expect meetups on Ray to start springing up in other geographic areas. We are still in the early stages of adoption of machine learning technologies. The presentations at this meetup confirm that an accessible and scalable platform like Ray opens up many possible applications of reinforcement learning and online learning.

For more on Ray:

Continue reading Notes from the first Ray meetup.

Four short links: 15 August 2018

O'Reilly Radar - Wed, 2018/08/15 - 03:50

Retro Hacks, Timsort, e-ink UI, and Inside Time Zones

  1. TRS-80 Galaxy Invasion on an RC2014 -- I love these retro hacks. This uses a homebrew Z80 with a Raspberry Pi Zero (!) to do the video graphics, which is painful and burdensome otherwise.
  2. Timsort -- all you need to know about Python's sorting algorithm.
  3. PaperTTY -- Python module to render a TTY on e-ink.
  4. Working with Time Zones -- the graphs are such a brilliant way of explaining it!

Continue reading Four short links: 15 August 2018.

What do computers see?

O'Reilly Radar - Wed, 2018/08/15 - 03:30

Tricks to visualize and understand how neural networks see.

In the last decade or so, computer vision has made tremendous strides forward, mostly because of developments in deep learning. It is not just that we have new and better algorithms—digital cameras and the web have provided us with a near infinite set of training data. And maybe even more importantly, graphics cards developed for computer gaming turn out to have super computer powers when it comes to training deep neural networks.

This is all good news for anybody wanting to experiment with deep learning and image recognition. All it takes to build a cat versus dog classifier these days is Keras and a Python notebook with 100 lines of code. But doing this doesn't tell us what computers see.

If we wanted to understand how humans see, we could open their skulls and try to figure out how information flows from the eye’s photoreceptor cells through the visual cortex to the rest of the brain. This would be rather hard, though. It’s much easier to poke that opened-up brain with an electrode until the subject sees the color blue. So, how do we prod a neural net with an electrode?

Modern networks typically consist of a large number of layers stacked on top of each other. The image to be recognized gets fed into the lowest layer, and as the information travels through the network, the image representation becomes more abstract until at the other end, a label comes out and the network says, I see a cat!

Poking a neural network with an electrode boils down to running this process in reverse; rather than showing the network a picture and asking it what it sees, we’re going to give the network some noise and ask it to make changes such that a particular neuron has a maximum activation. The image that does that represents what this particular neuron sees, what a human would report seeing if we prodded that neuron.

Let’s start by loading a pre-trained image-recognition network into Keras:

from keras.applications import vgg16

model = vgg16.VGG16(weights='imagenet', include_top=False)
model.summary()

That last statement shows us the structure of the network.

We define a loss function that optimizes the output of the given neuron, then create an iterate Keras function that optimizes toward that by changing the input image. We then start with an image filled with noise and run the iteration 16 times. (All code referred to in this post is available on GitHub as both a standalone script and a Python notebook. See the references at the end of the article.)

from keras import backend as K
import numpy as np

input_img = model.input                            # symbolic input to the network
layer_output = model.get_layer(layer_name).output  # layer containing the chosen neuron
step = 1.0                                         # gradient ascent step size

loss = K.mean(layer_output[:, :, :, neuron_idx])
grads = K.gradients(loss, input_img)[0]
iterate = K.function([input_img], [loss, grads])

img_data = np.random.uniform(size=(1, 256, 256, 3)) + 128.
for i in range(16):
    loss_value, grads_value = iterate([img_data])
    img_data += grads_value * step

We can try this on some random neuron and we see some artifacts appear that seem to tickle this specific neuron:

That’s cute! Not quite what I’d imagine brain-surgery-induced hallucinations look like, but it is a start! This neuron was picked from a random layer. Remember as we go through the stacked layers, the abstraction level is supposed to go up. What we can do now is sample neurons from different layers and order them by layer; that way we get an idea of the sort of features that each layer is looking out for:

This aligns nicely with our intuition that abstraction goes up as we traverse the layers. The lowest layers are looking for colors and simple patterns; if you go higher, the patterns become more complex.

Unfortunately, there’s a limit to how far this trick will get us; if you find the bit in the highest layer that represents cat-ness, you can optimize all you want, but no obvious cats will roll out.

We can get more detail, though, by zooming in on the image as we run the optimization function. The idea here is if you optimize an image for neuron activation, it tends to get stuck in a local minimum since there is no good way for it to influence the overall structure of the image. So instead of starting with a full image, we start with a small 64x64 image that we optimize. We then scale the image up a bit and optimize again. Doing this for 20 steps gives us a nice and full result that has a certain plasticity to it.
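The zoom-and-optimize schedule can be sketched without any deep learning machinery. In this sketch, `optimize` is a placeholder for the gradient ascent loop shown earlier, and nearest-neighbor upscaling by an integer factor stands in for the small fractional zoom the post actually uses:

```python
import numpy as np

def upscale(img, factor=2):
    """Nearest-neighbor upscaling (a stand-in for a proper image resize)."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

def optimize(img, steps=16):
    """Placeholder for the neuron-activation gradient ascent loop."""
    return img  # in the real code: 16 iterations of `iterate`

img = np.random.uniform(size=(64, 64, 3))  # start small
for _ in range(3):       # the post uses about 20 scale-up rounds
    img = optimize(img)  # refine detail at the current resolution
    img = upscale(img)   # grow the canvas, then refine again

print(img.shape)  # (512, 512, 3)
```

Optimizing at each scale before enlarging is what lets the coarse structure settle first, so later rounds only fill in finer detail.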

The zooming itself is somehow pleasing, like something organic unfolding.

We can make this into a movie of arbitrary length if we keep zooming, but once we reach a certain size, we also need to start center cropping so the image always remains the same size. This movie has a nice fractal-like mesmerizing effect. But why stop there? We can loop through a set of random neurons while zooming, making for a wonderful acid-like movie:

In this post, we’ve seen some tricks to visualize what a neural network sees. The visualizations are interesting and somehow compelling by themselves, but they also give us an understanding into how computer vision works. As information flows through, the level of abstraction increases, and we can to some extent show those abstractions.

Both a notebook version and a standalone script version of the code can be found on GitHub.

Continue reading What do computers see?.

3 promising areas for AI skills development

O'Reilly Radar - Tue, 2018/08/14 - 05:30

O'Reilly survey results and usage data reveal growing trends and topics in artificial intelligence.

One of the key findings of a survey we released earlier this year (How Companies are Putting AI to Work through Deep Learning) was that the leading reason holding companies back from incorporating deep learning was their lack of access to skilled people. One-fifth of respondents pointed to a skills gap as one of the reasons they haven’t integrated deep learning, and at the time of the survey, 75% of respondents indicated their company had some combination of internal and external training programs to address this issue.

We’ve continued to monitor interest in topics relevant to building AI products and systems, specifically areas that also warrant investment in skills development. In this post, I’ll share results of related studies we’ve conducted. I’ll draw from two data sources:

  • We examine usage[1] across all content formats on the O’Reilly online learning platform, as well as demand via volume of search terms.
  • We recently conducted a survey (full report forthcoming) on machine learning adoption, which included more than 6,000 respondents from North America.

I’ll use key portions of our upcoming AI Conference in San Francisco to describe how companies can address the topics and findings surfaced in these two recent studies.

Growing interest in key topics

Through the end of June 2018, we found double-digit growth in key topics associated with AI. Our online learning platform usage metrics encompass many content formats including books, videos, online training, interactive content, and other material:

Growth was strong across many topics associated with AI and machine learning. The chart below provides a sense of how much content usage (“relative popularity”) we’re seeing in some of these key topics: our users remain very interested in machine learning, particularly in deep learning.

It’s one thing to learn about an individual technology or a specific class of modeling techniques, but ultimately, organizations need to be able to design robust AI applications and products. This involves hardware, software infrastructure to manage data pipelines, and elegant user interfaces. For the upcoming AI Conference in San Francisco, we assembled training sessions, tutorials, and case studies on many of these important topics:

We’ve also found that interest in machine learning compares favorably with other areas of technology. We track interest in topics by monitoring search volume on our online learning platform. Alongside Kubernetes and blockchain, machine learning has been one of the fast-growing, high-volume search topics year over year:

Emerging topics

As I noted in the first chart above, we are seeing growing interest in reinforcement learning and PyTorch. It’s important to point out that TensorFlow is still by far the most popular deep learning framework, but as with other surveys, we are seeing that PyTorch is beginning to build a devoted following. Looking closely at interest in topics within data science and AI, we found that interest in reinforcement learning, PyTorch, and Keras has risen substantially this year:

The chart below provides a ranked list of industries that are beginning to explore using reinforcement learning and PyTorch:

We’ve had reinforcement learning tutorial sessions and presentations from the inception of our AI Conference. As tools and libraries get simpler and more tightly integrated with other popular components, I’m expecting to see more mainstream applications of reinforcement learning. We have assembled tutorial sessions and talks at the AI Conference on reinforcement learning and on popular tools for building deep learning applications (including PyTorch and TensorFlow):

Toward a holistic view of AI applications

There is growing awareness among major stakeholders about the importance of data privacy, ethics, and security. Users are beginning to seek more transparency and control over their data, regulators are beginning to introduce data privacy rules, and there is growing interest in ethics and privacy among data professionals.

There is an emerging set of tools and best practices for incorporating fairness, transparency, privacy, and security into AI systems. For our upcoming AI Conference in San Francisco, we have a wide selection of tutorials and sessions aimed both at technologists wanting to understand how to incorporate ethics and privacy into applications and at managers needing to understand what these new tools and technologies can provide:

[1] Analysis is based on non-personally-identifiable information about usage on O’Reilly’s online learning platform.

Continue reading 3 promising areas for AI skills development.

Four short links: 14 August 2018

O'Reilly Radar - Tue, 2018/08/14 - 03:50

Hyrum's Law, Academic Torrents, Logic Textbook, and Suboptimal Fairness

  1. Hyrum's Law -- With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody. (via Simon Willison)
  2. Academic Torrents -- a community-maintained distributed repository for data sets and scientific knowledge. 27GB and growing.
  3. Open Logic Project -- a collection of teaching materials on mathematical logic aimed at a non-mathematical audience, intended for use in advanced logic courses as taught in many philosophy departments. It is open source: you can download the LaTeX code. It is open: you’re free to change it whichever way you like, and share your changes. It is collaborative: a team of people is working on it, using the GitHub platform, and we welcome contributions and feedback. And it is written with configurability in mind.
  4. Delayed Impact of Fair Machine Learning (Paper a Day) -- it’s therefore possible to have a fairness intervention with the unintended consequence of leaving the disadvantaged group worse off than they were before.

Continue reading Four short links: 14 August 2018.

Categories: Technology

A quick reminder on HTTPS everywhere

O'Reilly Radar - Tue, 2018/08/14 - 03:30

HTTPS "everywhere" means everywhere—not just the login page, or the page where you accept donations. Everything.

HTTPS Everywhere! So the plugin says, and now browsers are warning users that sites not implementing https:// are security risks. Using HTTPS everywhere is good advice. And this really means "everywhere": the home page, everything. Not just the login page, or the page where you accept donations. Everything.

Implementing HTTPS everywhere has some downsides, as Eric Meyer points out. It breaks caching, which makes the web much slower for people limited to satellite connections (and that's much of the third world); it's a problem for people who, for various reasons, have to use older browsers (there are more ancient browsers and operating systems in the world than you would like to think, trust me); domain names and IP addresses are handled by lower-level protocols that HTTPS doesn't get to touch, so it's not as private as one would like; and more. It's not a great solution, but it's a necessary one. (Meyer's article, and the comments following it, are excellent.)

The real problem isn't HTTPS's downsides; it's that I see and hear more and more complaints from people who run simple non-commercial sites asking why this affects them. Do you need cryptographic security if your site is a simple read-only, text-only site with nothing controversial? Unfortunately, you do. Here's why. Since the ISPs' theft of the web (it's not limited to the loss of Network Neutrality, and not just an issue in the U.S.), the ISPs themselves can legally execute man-in-the-middle attacks to:

  • Insert advertising into the pages they serve
  • Inject tracking identifiers ("supercookies") to monitor user behavior
  • Collect and sell users' browsing history
The first two are happening already; the third may be. (It's possible that GDPR, which protects European citizens regardless of where they're located, might prevent ISPs from collecting and selling browsing history. I wouldn't count on it, though.)

Yesterday, I poked around a bit and found many sites that don't use HTTPS everywhere. Those sites include an ivy league university (Cornell, get your act together!), many non-profit organizations (including several that I belong to), several well-known newspapers and magazines, local libraries, and a lot of small businesses. The irony is that most of these sites accept donations, let you read restricted-access materials, and even sell stuff online, and those pages already use HTTPS (though not always correctly). Protecting the entire site doesn't require that big a change. In many cases, using HTTPS for the entire site is simpler than protecting a limited collection of pages.
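A check like this is easy to script. The sketch below is hypothetical (the regex, function name, and sample markup are mine, not taken from any of the sites above): it flags `href`, `src`, and `action` attributes that still point at plain `http://` URLs, which is where partial HTTPS deployments usually leak.

```python
import re

# Flag plain-http references in a page's HTML. A page served over HTTPS
# that still embeds http:// assets triggers mixed-content warnings, and
# plain http:// links send visitors back out of the protected site.
HTTP_REF = re.compile(
    r'''(?:href|src|action)\s*=\s*["'](http://[^"']+)["']''',
    re.IGNORECASE,
)

def insecure_references(html: str) -> list[str]:
    """Return every http:// URL referenced by href/src/action attributes."""
    return HTTP_REF.findall(html)

page = '''
<a href="https://example.org/donate">Donate</a>
<img src="http://cdn.example.org/logo.png">
<form action="http://example.org/search">
'''
print(insecure_references(page))
# → ['http://cdn.example.org/logo.png', 'http://example.org/search']
```

Pointing a crawler at your own site and running each page through a check like this is a quick way to find the stragglers before your visitors' browsers do.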

I agree that HTTPS is a significant administrative burden for simple, static sites, or sites that are run by groups that don’t have the technical ability to implement HTTPS. Services like Let's Encrypt reduce some of the burden (Let's Encrypt provides free certificates, reducing the process of setting up HTTPS to a few well-placed clicks)—but you still have to do it.

Nothing stays simple and elegant, particularly when it's under attack. And the web is under attack: from the pirate ISPs, from hostile governments (a somewhat different issue, but related), and from many other actors. HTTPS is a solution—a problematic one, I’ll grant, and one that imposes a burden on the sites least capable of dealing with technical overhead, but we don't have a better solution. Well, yes, we do have a better solution: IPSec and IPv6 solve the problem nicely. But we've been waiting for widespread adoption of those for more than 20 years, and we're still waiting. These are problems we need to solve now.

"There's nothing on my site that requires cryptography" doesn't help anyone. That's no more helpful than people who say "I have nothing to hide, so I don't need privacy." You don't need either privacy or HTTPS until you do, and then it's way, way too late. Do it for your users' sake.

Continue reading A quick reminder on HTTPS everywhere.

Categories: Technology

Four short links: 13 August 2018

O'Reilly Radar - Mon, 2018/08/13 - 06:25

Algorithms, Feedback, Transliteration, and Diagnosis

  1. Dijkstras in Disguise -- It turns out that many algorithms I've encountered in my computer graphics, finance, and reinforcement learning studies are all variations of this relaxation principle in disguise. [...] This blog post is a gentle tutorial on how all these varied CS topics are connected.
  2. The Blacker the Box -- The thesis of this post is: The faster the feedback on prediction accuracy, the blacker the box can be. The slower the feedback, the more your models should be explicit and formal.
  3. Design Challenges in Named Entity Transliteration -- In order to improve availability of bilingual named entity transliteration data sets, we release personal name bilingual dictionaries mined from Wikidata for English to Russian, Hebrew, Arabic, and Japanese Katakana. Our code and dictionaries are publicly available. GitHub.
  4. AI Spots Fibromyalgia -- A machine-learning algorithm that was programmed to recognize this neurological signature was able to use it to predict which brain scans were indicative of fibromyalgia and which were not. [...] López-Solà’s research is compelling evidence to convince those who are reluctant to accept the existence of fibromyalgia. Interesting because medical science still debates whether the condition is real, yet the software can reliably identify something.
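The relaxation principle the first link describes is easy to see in code. Here is a minimal Bellman-Ford sketch over a hypothetical edge list (the graph and values are illustrative, not from the linked post): every shortest-path variant boils down to this "if going through u is cheaper, update v" step.

```python
import math

def bellman_ford(n: int, edges: list[tuple[int, int, float]], src: int) -> list[float]:
    """Single-source shortest paths by repeated edge relaxation."""
    dist = [math.inf] * n
    dist[src] = 0.0
    for _ in range(n - 1):             # n-1 passes guarantee convergence
        for u, v, w in edges:
            if dist[u] + w < dist[v]:  # relax edge (u, v)
                dist[v] = dist[u] + w
    return dist

edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 1.0)]
print(bellman_ford(4, edges, 0))
# → [0.0, 3.0, 1.0, 4.0]
```

Dijkstra's algorithm applies the same relaxation step but chooses the order of relaxations greedily with a priority queue.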

Continue reading Four short links: 13 August 2018.

Categories: Technology

Security Meeting Topic for Thursday 6/16

PLUG - Sun, 2018/08/12 - 21:03
This month Aaron Jones will present "Introduction To Remote Access Tools (RATs)."

Introduction To Remote Access Tools (RATs) is a two-hour course designed to provide an overview of what a RAT is, how it is created, and what kinds of legal and illegal behaviors are being facilitated by their deployment.

You are invited, and we encourage you to invite anyone you like. This is open to the public, as well as to law enforcement or anyone else who would like to come. I think it will be a good class and hope to see you there.

More information available from: https://phxlinux.org/index.php/meetings/20-plug-security.html

Four short links: 10 August 2018

O'Reilly Radar - Fri, 2018/08/10 - 04:05

CS for DS, Operations Research, Coding Sandpit, and GitHub's Load Balancer

  1. The Best Books on Computer Science for Data Scientists (Hadley Wickham) -- a solid list.
  2. OR Tools -- an open source, fast, and portable software suite for solving combinatorial optimization problems.
  3. Pencil Code -- a collaborative programming site for drawing art, playing music, and creating games. It is also a place to experiment with mathematical functions, geometry, graphing, webpages, simulations, and algorithms. Programs are open for all to see and copy. In my head as a good "what next after Scratch?"
  4. GLB Director: GitHub Load Balancer -- a Layer 4 load balancer that scales a single IP address across a large number of physical machines while attempting to minimize connection disruption during any change in servers.
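GLB Director's actual data plane is more involved, but the property described in the last link—minimal connection disruption when the server set changes—is the hallmark of rendezvous-style hashing. The sketch below is a generic highest-random-weight hash with made-up server names, not GLB's real algorithm: when a server is added, only connections whose highest-weight server is the new one move.

```python
import hashlib

def owner(connection_key: str, servers: list[str]) -> str:
    """Rendezvous hashing: each connection goes to its highest-weight server."""
    def weight(server: str) -> int:
        digest = hashlib.sha256(f"{server}:{connection_key}".encode()).hexdigest()
        return int(digest, 16)
    return max(servers, key=weight)

servers = ["s1", "s2", "s3"]
keys = ("conn-a", "conn-b", "conn-c", "conn-d")
before = {k: owner(k, servers) for k in keys}
after = {k: owner(k, servers + ["s4"]) for k in keys}

# Only keys whose highest-weight server is the newcomer are reassigned;
# everything else keeps its existing backend, so connections survive.
moved = [k for k in keys if before[k] != after[k]]
print(moved)
```

Contrast this with naive modulo hashing (`hash(key) % len(servers)`), where adding one server reshuffles almost every connection.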

Continue reading Four short links: 10 August 2018.

Categories: Technology

3 core principles of design thinking

O'Reilly Radar - Fri, 2018/08/10 - 04:00

How design thinking works, and how it integrates with product development.

Continue reading 3 core principles of design thinking.

Categories: Technology

Skydeck turns research into reality

O'Reilly Radar - Thu, 2018/08/09 - 06:45

UC Berkeley’s startup accelerator takes university research discoveries and helps translate them into marketable products.

The Berkeley acceleration method

The Berkeley Skydeck office is located in the penthouse of the tallest building in downtown Berkeley. Sweeping views of the bay are accessible from virtually every side of the large, open-concept space, surely providing inspiration to the teams selected for this spring’s portfolio. There is a modest kitchen, plenty of desk space, and, perhaps most importantly, a large quantity of coffee.

Skydeck has built a space for its teams to work comfortably and, ideally, effectively, with the idea that they can accomplish most tasks in-house. Through workshops, mentorship, and a creative environment in which to develop, teams are given the resources they need to grow. At the start of the portfolio period, each startup receives a $50,000 investment from the Berkeley SkyDeck Fund, with an additional $50,000 to follow after completion of half of the program. Over the course of six months, teams complete the Berkeley Acceleration Method (BAM), attending mandatory training in each of the six practice areas:

  1. Design

  2. Funding plan

  3. Business model

  4. Team development

  5. Product story

  6. Market traction

These training sessions are conducted by specialists who focus solely on one area. For example, David Riemer, lecturer at the UC Berkeley Haas School of Business, helps teams develop their product story and Hilary Weber, founder of Opportu, offers guidance in team development.

A Skydeck startup’s resources

Beyond useful resources offered in the office, the accelerator also provides an abundance of human capital to its portfolio teams in the form of 115+ advisors and 27 partners. These advisors provide expertise on developing business models, raising funding, and many other tasks key to growing a startup. The accessibility of these advisors is a significant feature of the program. Each team is matched with a lead advisor who remains in direct contact throughout the program. Teams are also encouraged to contact anyone on the advisory board and to attend office hours for one-on-one counseling on branding, intellectual property, product development, and more. The advisors come from diverse backgrounds, ranging from seed investors to chief scientists, providing the portfolio teams with a wide variety of consultants.

Figure 1-1. Skydeck’s penthouse office in Downtown Berkeley (Source: Skydeck)

Skydeck participants also have access to a variety of technical resources. The Haas Startup Squad is a group of current MBA students from the Haas School of Business who help the teams with business applications, such as market research. There is also a group of Berkeley post-docs that are available for technical science questions regardless of the topic or field. From a design perspective, there are Berkeley students who will assist with graphic design projects and industrial designing.

In terms of on-campus resources, Skydeck startups also have access to the Molecular Foundry and the Jacobs Institute. The Molecular Foundry is a component of the Lawrence Berkeley National Labs. Through the foundry, researchers have access to resources for nanofabrication, nanostructure imaging, and synthetic biology protocols. The Jacobs Institute is geared towards industrial engineering and provides community access to 3-D printers, an electronics lab, machine shops, and more.

Funding the accelerator program

With the vast amount of resources available to teams and the up to $100,000 investment in each startup, Skydeck must raise a large quantity of funding to support its high-cost program. The Berkeley SkyDeck Fund is supported by global venture capital investors with two major partnership levels.

Silver Bear Partners gain in-depth connections at Skydeck, with a seat on the Skydeck External Advisory Board, access to Skydeck workshops, the ability to host a Skydeck workshop, and more. Both Ford and Intel are currently Silver Bear Partners. Golden Bear Partners carry a bit more weight, with access to the entire Skydeck review pipeline (accounting for more than 1,000 startups each year), selection of a team for incubation through the bi-annual program, two seats on the Skydeck External Advisory Board, and all of the benefits of the Silver Bear Partner program. Other ways to get involved in Skydeck include being a vendor, sponsor, resource, or global partner. Current Golden Bear Partners include Ericsson and Analog Devices.

Additionally, the $100K investment provided by Skydeck is made in exchange for 5% equity. Half of the carry profits from the whole process are donated back to Skydeck and to UC Berkeley in the hope that these funds will provide for the next generation of Berkeley entrepreneurs.

The Skydeck portfolio

Skydeck offers an agnostic selection process for their seasonal startups, resulting in an extremely diverse group of teams each term. The only real selection criteria for each funding season is that one of the startup team members must be affiliated with UC Berkeley in some way, whether it is through being a student, alumni, or even a professor at the university. Since 2012, Skydeck has hosted 150 startups that include companies such as Zephyrus Biosciences, First Derm, and CinderBio.

Zephyrus Biosciences creates technology that automates a key technique in wet-lab research: electrophoresis. The company’s instrument enables automation of electrophoresis and photocapture to analyze protein expression of cell suspensions. The company was acquired by Bio-Techne Corporation in 2016 and was integrated into the Proteins Platform Division.

First Derm is an online platform that connects patients to dermatologists in under 24 hours. Users upload two pictures of the affected skin area, submit a case to First Derm’s team of dermatologists, and receive feedback from the company’s online doctors. First Derm’s quick review of cases often prevents unnecessary trips to a doctor’s office, potentially making its $29 case fee the economical option for patients.

CinderBio creates enzyme technology from volcanic water microbes that remain active at high heat and extremely low pH. The company’s enzymes allow researchers to reduce the costs of basic lab enzyme experiments due to the efficiency of the ultra-stable technology. CinderBio’s product has applications in research as well as in biofuels and the company received NSF awards in both 2015 and 2016.

The startups selected for the Spring 2018 portfolio range from an online breastfeeding resource to technology designed to combat antibiotic resistance.

May & Meadow is a digital health technology addressing breastfeeding with a combination of clinical science and sensor diagnostics. The company aims to improve data collection on feeding in order to provide mothers with personalized information and to clear up questions regarding their baby’s nutritional health.

Sublime Therapeutics and Nextbiotics both intend to impact clinical medicine. Sublime focuses on developing anti-gastrin technology to target gastrointestinal, pancreatic, and neuroendocrine cancers. Nextbiotics uses CRISPR and bacteriophage technology to reverse antibiotic resistance for applications in humans, as well as in the environment.

BioXplor is an AI platform for use in life science discoveries. An intersection between data collection and computing, BioXplor’s technology enables hypothesis generation and the development of novel predictions for research and development with drug discovery. Data is gathered from electronic medical records, genomics data, and other relevant resources. This data is then interpreted via the company’s accelerated and automated process in order to provide useful insights to its users.

All of the companies seek to address an impactful problem with an innovative solution, and Skydeck offers the resources and expertise to take each from startup to market. While other accelerators, such as IndieBio, focus on a specific sector of the market, Skydeck encourages any Berkeley-affiliated team with a feasible vision to apply. This promotes a diverse group of startups with the potential to affect society in many different ways, through many different technologies.

For more information on the Skydeck program, please visit the website (http://skydeck.berkeley.edu/). Other teams participating in the spring 2018 Skydeck portfolio can be found through the following link: http://skydeck.berkeley.edu/portfolio/.

Continue reading Skydeck turns research into reality.

Categories: Technology

Collaborative design fuels shared understanding—the fundamental currency of Lean UX

O'Reilly Radar - Thu, 2018/08/09 - 05:55

Lean UX begins with the idea that user experience design should be a collaborative process.

As you navigate through the rest of your life, be open to collaboration. Other people and other people’s ideas are often better than your own. Find a group of people who challenge and inspire you, spend a lot of time with them, and it will change your life.

—Amy Poehler

What is a “user experience”? It’s the sum total of all of the interactions a user has with your product and service. It’s created by all of the decisions that you and your team make about your product or service: the way you price it, the way you package and sell it, the way you onboard users, the way you support it and maintain it and upgrade it, and so on and so on. In other words, it’s created by a team, not an individual user interface designer. For this reason, Lean UX begins with the idea that user experience design should be a collaborative process.

Figure 1-1. The Lean UX cycle

Lean UX brings designers and nondesigners together in co-creation. It yields ideas that are bigger and better than their individual contributors. But it’s not design-by-committee. It’s a process that is orchestrated and facilitated by designers, but one that’s executed by specialists working in their individual discipline who work from a common playbook you create together. Lean UX increases your team’s ownership over the work by providing an opportunity for individual points of view to be shared much earlier in the process.

In this chapter we’ll explore the many benefits that come from this close, cross-functional collaboration. Specifically, we’ll look at the following:

  • Why everybody gets to design

  • How low-fidelity artifacts increase collaboration

  • Building a shared understanding across your team

We’ll also look at a set of techniques that enable this more productive way of working:

  • Design Studio—a collaborative sketching exercise for the entire team

  • Design systems and style guides—living repositories of all the customer-facing elements of your product

  • Collaboration techniques for geographically distributed teams

Let’s dig in...

Collaborative Design

Earlier, you learned about hypotheses. To test your hypotheses, you sometimes simply conduct research. But other times, you need to design and build something that will help you to test these hypotheses. For example, if you’re in the early stage of a project, you might test demand by creating a landing page that will measure how many customers sign up for your service. Or if you’re later in the product lifecycle, you might be working at the feature level—adding some new functionality that will make users more productive, for example. Navigating the many possible design options for these features can be difficult for teams. How often have you experienced team conflict over design choices?

The most effective way we’ve found to rally a team around a design direction is through collaboration. Over the long haul, collaboration yields better results than hero-based design (the practice of calling in a designer or design team to drop in, come up with something beautiful, and take off to rescue the next project). Teams rarely learn or get better from working with heroes. Instead, in the same way that creating hypotheses together increases the Product IQ of the team, designing together increases the Design IQ of the team. It allows all of the members of the team to articulate their ideas. It gives designers a much broader set of ideas to draw upon as they refine the design. This, in turn, increases the entire team’s feelings of ownership in the work. Finally, collaborative design builds team-wide shared understanding. It is this shared understanding that is the currency of Lean UX. The more the team collectively understands, the less it has to document in order to move forward.

Collaborative design is an approach that allows a team to design together. It helps teams build a shared understanding of both the design problem and the solution. It provides the means for them to work together to decide which functionality and interface elements best implement the feature they want to create.

Collaborative design is still a designer-led activity. It’s the designer’s responsibility to not only call collaborative design meetings but to facilitate them, as well. Sometimes, you’ll have informal chats and sketching sessions. Sometimes, more structured one-on-one sessions with a developer at a whiteboard. Other times, you will gather the entire team for a Design Studio exercise. The key is to collaborate with a diverse group of team members.

In a typical collaborative design session, teams sketch together, critique the work as it emerges, and ultimately converge on a solution they feel has the greatest chance of success. The designer, while still producing designs, takes on the additional role of facilitator to lead the team through a series of exercises.

The output of these sessions typically consists of low-fidelity sketches and wireframes. This level of fidelity is important. First, it makes it possible for everyone to contribute, even team members with less sophisticated drawing skills. Second, it’s critical to maintaining the malleability of the work. This gives the team the ability to pivot quickly if their tests reveal that the approach isn’t working. It’s much easier to pivot from a failed approach if you haven’t spent too much time laboriously drawing, documenting, and detailing that approach.

Collaborative Design: The Informal Approach

A few years ago, Jeff was designing a dashboard for a web app targeted at TheLadders’ recruiter and employer audience. There was a lot of information to fit on one screen and he was struggling to make it all work. Instead of burning too much time at his desk pushing pixels, he grabbed a whiteboard and asked Greg, the lead developer, to join him. Jeff sketched his original idea about how to lay out all of the content and functionality for this dashboard (see Figure 1-2). The two of them then discussed the idea, and eventually Jeff handed Greg the marker. He sketched his ideas on the same whiteboard. They went back and forth, ultimately converging on a layout and flow that they felt was both usable and feasible, given that they needed to deliver a solution within the current two-week sprint. At the end of that two-hour session, they returned to their desks and began working. Jeff refined the sketch into a more formal wireframe and workflow while Greg began to write the infrastructure code necessary to get the data they needed to the presentation layer.

Figure 1-2. Examples of whiteboard sketches

They had built a shared understanding through their collaborative design session. They both knew what they were going to build and what the feature needed to do. They didn’t need to wait to document it. This allowed them to get the first version of this idea built within a two-week time frame.

Conversation: Your Most Powerful Tool

Lean UX promotes conversation as the primary means of communication among team members. In this way, it is very much in line with the Agile Manifesto that promotes “Individuals and interactions over processes and tools.” Conversation unites a team around a shared vision. It also brings insights from different disciplines to the project much earlier than a traditional design cycle would allow. As new ideas are formed or changes are made to the design, a team member’s insight can quickly challenge those changes in a way the designer alone wouldn’t have recognized.

By having these conversations early and often, the team is aware of everyone’s ideas and can get started on their own work earlier. If they know that the proposed solution requires a certain backend infrastructure, for example, the team’s engineers can get started on that work while the design is refined and finalized. Parallel paths for software development and design are the fastest route to reach an actual experience.

These conversations might seem awkward at first; after all, you’re breaking down time-tested walls between disciplines. As the conversation evolves, however, designers provide developers with input on the implementation of certain features, ensuring the proper evolution of their vision. These conversations promote transparency of process and progress. This transparency builds a common language and deeper bonds between team members. Teammates who trust one another are more motivated to work together to produce higher-quality work.

Find ways to have more conversations with your teammates, both work-related and not. Time spent cultivating social ties with your team—eating meals together, for example—can make work-related conversations easier, more honest, and more productive.

Collaborative Design: A More Structured Approach

When your team is comfortable collaborating, informal sessions like the one we’ve just described take place all the time. But sometimes, you are going to need to gather everyone for a formal working session. Design Studio is a popular way to do this.1

This method, born in the architecture world where it was called Design Charrette, is a way to bring a cross-functional team together to visualize potential solutions to a design problem. It breaks down organizational silos and creates a forum for your fellow teammates’ points of view. By putting designers, developers, subject matter experts, product managers, business analysts, and other competencies together in the same space focused on the same challenge, you create an outcome far greater than working in silos allows. It has another benefit. It begins to build the trust your team will need to move from these formal sessions to more frequent and informal collaborations.

Running a Design Studio

The technique described in the sections that follow is very specific; however, you should feel free to run more or less formal Design Studios as your situation and timing warrant. The specifics of the ritual are not the point as much as the activity of solving problems with your colleagues and clients.

Setting

To run a Design Studio session, you’ll want to find a dedicated block of time within which you can bring the team together. You should plan on at least a three-hour block. You’ll want a room with tables that folks can gather around. The room should have good wall space, so you can post the work in progress to the walls as you go.

The team

The process works best for a team of five to eight people. If you have more people, you can just create more teams and have the teams compare output at the end of the process. (Larger groups take a long time to get through the critique and feedback steps, so it’s important to split groups larger than about eight people into smaller teams who can each go through the following process in parallel, converging at the end.)

Process

Design Studio works within the following flow:

  1. Problem definition and constraints

  2. Individual idea generation (diverge)

  3. Presentation and critique

  4. Iterate and refine in pairs (emerge)

  5. Team idea generation (converge)

Supplies

Here’s what you’ll need:

  • Pencils

  • Pens

  • Felt-tip markers or similar (multiple colors/thickness)

  • Highlighters (multiple colors)

  • Sketching templates (you can use preprinted one-up and six-up templates or you can use blank sheets of 11″ x 17″ [A3] paper divided into 6 boxes)

  • 25″ x 30.5″ (A1) self-stick easel pads

  • Drafting dots (or any kind of small stickers)

Problem definition and constraints (15–45 minutes)

The first step in Design Studio is to ensure that everyone is aware of the problem you are trying to solve, the assumptions you’ve declared, the users you are serving, the hypotheses you’ve generated, and the constraints within which you are working. This can be a formal presentation with slides or it can be a group discussion.

Individual idea generation (10 minutes)

You’ll be working individually in this step. Give each member of the team a six-up template, which is a sheet of paper with six empty boxes on it, as depicted in Figure 1-3. You can make one by folding a blank sheet of 11″ x 17″ paper or make a preprinted template to hand to participants. (Some teams like to hand out small individual whiteboards to each participant. These are great because they’re easy to erase and tend to make people feel relaxed.)

Figure 1-3. A blank “six-up” template

Sometimes, people find they have a hard time facing a blank page. If that’s the case, try this optional step. Ask everyone to label each box on their sheets with one of your personas and the specific pain point or problem they will be addressing for that persona. Write the persona’s name and pain point at the top of each of the six boxes. You can write the same persona/pain point pair as many times as you have solutions for that problem or you can write a different persona/pain point combination for each box. Any combination works. Spend five minutes doing this.

Next, with your six-up sheets in front of you, give everyone five minutes to generate six low-fidelity sketches of solutions (see Figure 1-4 and Figure 1-7) for each persona/problem pair on their six-up sheet. These should be visual articulations (UI sketches, workflows, diagrams, etc.) and not written words. Encourage your team by revealing the dirty secret of interaction design to level the playing field: If you can draw a circle, square, and a triangle, you can draw every interface. We’re confident everyone on your team can draw those shapes.

Figure 1-4. A wall full of completed six-up drawings

Presentation and critique (3 minutes per person)

When time is up, share and critique what you’ve done so far. Going around the table, give the participants three minutes to hold up their sketches and present them to the team (Figure 1-5). Presenters should explicitly state who they were solving a problem for (in other words, what persona) and which pain point they were addressing, and then explain the sketch. Each member of the team should provide critique and feedback to the presenter. Team members should focus their feedback on clarifying the presenter’s intentions.

Giving good feedback is an art: In general, it’s better to ask questions than to share opinions. Questions help the team talk about what they’re doing, and help individuals think through their work. Opinions, on the other hand, can stop the conversation, inhibit collaboration, and put people on the defensive. So, when you’re providing critique, try to use questions like, “How does this feature address the persona’s specific problem?” Or, “I don’t understand that part of the drawing. Can you elaborate?” Questions like these are very helpful. Comments such as, “I don’t like that concept,” provide little value and don’t give the presenter concrete ideas to use for iterating.

Figure 1-5. A team presenting and critiquing drawings during a Design Studio

Make sure that every team member presents and receives critique.

Pair up to iterate and refine (10 minutes)

Now ask everyone to pair up for the next round. (If two people at the table had similar ideas, it’s a good idea to ask them to work together.) Each pair will be working to revise their design ideas (Figure 1-6). The goal here is to pick the ideas that have the most merit and develop a more evolved, more integrated version of those ideas. Each pair will have to make some decisions about what to keep, what to change, and what to throw away. Resist the temptation here to create quick agreement by getting more general or abstract. In this step, you need to make some decisions and get more specific. Have each pair produce a single drawing on an 11″ x 17″ (A3) six-up sheet. Give each team 10 minutes for this step.

When the time is up, ask the team to go through the present-and-critique process again.

Figure 1-6. A team working together in a Design Studio exercise

Team idea generation (45 minutes)

Now that all team members have feedback on their individual ideas and people have paired up to develop ideas further, the team must converge on one idea. In this step, the team is trying to select the ideas they feel have the best chance for success. This set of ideas will serve as the basis for the next step in the Lean UX process: creating an MVP and running experiments (both covered in the next chapter).

Ask the team to use a large sheet of self-stick easel pad paper or a whiteboard to sketch the components and workflow for their idea. There will be a lot of compromise and wrangling at this stage, and to get to consensus, the team will need to prioritize and pare back features. Encourage the team to create a “parking lot” for good ideas that don’t make the cut. This will make it easier to let go of ideas. Again, it’s important to make decisions here: resist the temptation to get consensus by generalizing or deferring decisions.

(If you have split a large group into multiple teams in the Design Studio, ask each team to present their final idea to the room when they are finished for one final round of critique and feedback, and if desired, convergence.)

Using the output

Before you break, decide on next steps. You can use the designs you’ve created in this process as the basis for building MVPs, for running experiments, for production design and development—the process is very versatile. Just ensure that, having asked people to spend a lot of time contributing to the final design, you treat their contribution with respect. Decide together on next steps and then stay on top of the progress so that people keep their commitments and follow through.

To keep the output visible, post it on a design wall or another prominent place so that the team can refer back to it. Decide on what (if any) intermediate drawings people want to keep and display these alongside the final drawing, again so that team members can refer back to the ideas. Regardless of what you keep posted on the wall, it’s generally a good idea to photograph everything and keep it in an archive folder of some sort. You never know when you’ll want to go back to find something. It’s also a good idea to put a single person in charge of creating this archive. Creating some accountability will tend to ensure that the team keeps good records.

Figure 1-7. Output of a Design Studio session

Design Systems

So far in this chapter, we’ve focused on the ways that teams can design together. In practice, this usually means that teams are sketching together, either on paper or at a whiteboard. It almost never means that teams are sitting together at a workstation moving pixels around. In fact, this kind of group hovering at the pixel level is what most designers would consider their worst nightmare. (To be clear: don’t do this.)

And yet, design isn’t done when the sketch is done. It’s not completed at the whiteboard. Instead, it’s usually just getting started. So how do we get design to the pixel level? How do we get to finished visual design?

Increasingly, we’re seeing teams turn to design systems. Design systems are like style guides on steroids. They were an emerging species when we completed the first edition of this book but have now become an accepted best practice for digital product teams. Large organizations like Westpac (see Figure 1-8) and GE use them. Technology-native companies like MailChimp, Medium, Salesforce, and countless others use them, too. Even the US Federal Government has released a design system. There are entire two-day conferences dedicated to them. But, before we get into why design systems are having their moment, let’s talk about what they are.

Figure 1-8. The GEL design system website from Westpac

Design Systems: What’s in a Name?

Style guides. Pattern libraries. Brand guidelines. Asset libraries. There’s not a lot of common language in this part of the design world, so let’s take a moment to clarify our terms.

For years, large organizations created brand guidelines—comprehensive documents of brand design and usage rules for those companies. In predigital days, these guidelines were documents, sometimes a few pages, but frequently large, comprehensive bound volumes. As the world moved online, these books sometimes moved onto the Web as PDF documents, web pages, or even wikis.

At the same time, publishers and publications often maintained style guides that covered rules of writing and content presentation. College students in the United States are familiar with the comforting strictness of The Chicago Manual of Style, The MLA Style Manual and Guide to Scholarly Publishing, and others.

The computing world’s version of a style guide is exemplified by Apple’s famous Human Interface Guidelines (HIG). The HIG is a comprehensive document that explains every component in Apple’s operating system, provides rules for using the components, and contains examples that demonstrate proper use of the components.

Finally, developers are familiar with asset libraries. These collections of reusable code elements are intended to make the developer’s job easier by providing tested, reusable code that’s easy to download from an always-current code repository.

As with many ideas in the digital world, digital design systems (which we’ll call design systems for the sake of brevity) are a kind of mash-up of all of these ideas. A good design system contains comprehensive documentation of the elements of a design, rules and examples that govern the use of those elements, and, crucially, the code and other assets that actually implement the design.

In practice, a design system functions as a single source of truth for the presentation layer of a product. Teams can sketch at the whiteboard and then quickly use the elements found in the design system to assemble a prototype or production-ready frontend.

The Value of Design Systems

Design systems are a powerful enabler of Lean UX. They allow the visual and microinteraction details of a design to be developed and maintained in parallel with the other decisions a team makes. So decisions like screen structure, process flow, information hierarchy—things that can be worked out at the whiteboard—can be handled by the right group of teammates, whereas things like color, type, and spacing can be handled by another (very likely overlapping) group of folks.

This has a couple of big benefits for teams:

  • It allows the team to design faster, because they’re not reinventing the wheel every time they design a screen.

  • It allows the team to prototype faster, because frontend developers are working from a kit of parts—they don’t need to recreate the elements of a solution each time, they can just go get the appropriate pieces out of the design system.

It also has some big benefits for organizations:

Increased consistency
A good design system is easy for developers to use. So they are more likely to use parts that they find in the design system, and less likely to “roll their own.” This means a greater likelihood that their work will adhere to brand standards.
Increased quality
By centralizing the design and creation of user-facing elements, you can take advantage of the work of a few highly trained and highly specialized designers and UI developers. Their high-quality work can be implemented by other less-specialized developers in the organization to produce top-notch results.
Lower costs
A good design system is not free. It requires investment to build it and staff to maintain it. But over time, it pays for itself by providing tools and frameworks that make the users of the system—the other developers in the organization—more efficient and more productive. It allows new designers to come up to speed more quickly, for example, because it documents all of the frontend conventions used in an app. Similarly, it allows new developers to come up to speed more quickly, because the basic building blocks of their work are available in an easy-to-use framework.
Case Study: GE Design System

In 2012, GE opened GE Software in San Ramon, California. This new “Center of Excellence” (CoE) was designed to help GE improve its software game. A few years earlier, a strategic review helped the company to see just how central software had become to their business—measured in lines of code, GE was something like the 17th largest software company in the world. And yet they felt they were not treating software development with the focus it deserved.

San Ramon included a new team at GE: the GE Software User Experience Team. This small team at the heart of a giant company created their first design system in 2013 in order to scale the impact they could have. Indeed, with fewer than 50 designers collaborating with more than 14,000 developers (inside an organization of more than 300,000 people), there was no way that this startup design team could grow quickly enough to have a meaningful effect at GE.

The team’s first design system, called IIDS, for the Industrial Internet Design System, was designed by a group of internal designers with the help of a small team from Frog Design, one of the leading design firms in the world. The team built the system on top of Bootstrap, the HTML/CSS framework created by Twitter. It proved incredibly successful. Within a few years it had been downloaded by internal developers more than 11,000 times and had been used to create hundreds of applications. It helped software teams across the company produce better looking, more consistent applications. And, perhaps just as important, it created a huge amount of visibility for the software team and the UX team at San Ramon.

With that success came some problems. To be sure, simply having a good UI kit doesn’t mean that a team can produce a well-designed product. Design systems don’t solve every design problem. And Bootstrap was showing its limits as a platform choice. It had helped the team achieve their first objectives: get something out quickly, provide broad coverage of UI elements, and create wide adoption by being easier to use than “roll-your-own” solutions. But Bootstrap was hard to maintain and update and was just too big for most needs.

In 2015, GE Software, having had great success as an internal service bureau, morphed into GE Digital, a revenue-generating business in its own right. Their first product was called Predix (Figure 1-9), a platform on top of which developers inside and outside of GE can build software for industrial applications. And with this change of strategy, the team realized they needed to rethink their design system. Whereas earlier the goal had been to provide broad coverage and broad adoption, the new design system would be driven by new needs: it needed to enable great Predix applications, which was a more focused problem than before. It needed to limit the number of UI choices rather than supporting every possible UI widget. It still needed to be easy to adopt and use—it was now intended for use by GE customers—but now it was imperative that it be easy to maintain, as well.

The design system team had by this time grown to about 15 people and included design technologists (frontend developers who are passionate about both design and code), interaction designers, graphic designers, a technical writer, and a product owner.

Figure 1-9. The GE Predix Design System

The team chose to move the design system to a new technology platform. No longer based on Bootstrap, the system has instead been created with Polymer, a JavaScript framework that allows the team to implement Web Components. Web Components has emerged in the last few years as a way to enable more mature frontend development practices.
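In a real design system like GE’s, a component would be defined as a custom element (with Polymer or the standard Custom Elements API). As a rough, browser-free sketch of the underlying idea—a named component that encapsulates its allowed variants and its render rule—here is a pure render function for a hypothetical `px-button` component. The tag name, class names, and variants are invented for illustration, not taken from the Predix system.

```typescript
// Hypothetical design-system component contract: a fixed set of
// variants and a single rule for producing markup. A real Web
// Component would extend HTMLElement; this pure function just
// shows the encapsulation.
type ButtonVariant = "primary" | "secondary" | "danger";

interface PxButtonProps {
  label: string;
  variant?: ButtonVariant; // defaults to "primary"
  disabled?: boolean;
}

// Render the markup a <px-button> element might produce.
function renderPxButton(props: PxButtonProps): string {
  const variant = props.variant ?? "primary";
  const disabled = props.disabled ? " disabled" : "";
  return `<button class="px-btn px-btn--${variant}"${disabled}>` +
         `${props.label}</button>`;
}

console.log(renderPxButton({ label: "Save" }));
// <button class="px-btn px-btn--primary">Save</button>
```

Because the variants are a closed type, an application team can’t invent a one-off button style—the same “limit the number of UI choices” goal the GE team set for Predix.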

To create the new design system, the team spent nearly six months prototyping. Significantly, the team did not work in isolation. Instead, they paired with one of the application teams, and thus were designing components to meet the needs of their users—in this case the designers and developers working on the application teams. This point is really important. Collaborative design takes many forms. Sometimes it means designing with your cross-functional team. Sometimes it means designing with your end users. In this instance, it was a hybrid: designing with a cross-functional team of designers and developers who actually are your users.

Figure 1-10. The GE Predix Design System on GitHub

Creating a Design System

As the GE story illustrates, there’s more than one way to create a design system, and the choices you and your team make should be driven by the goals you have for the work and capabilities at your disposal. GE is a company with a large enough budget to hire excellent consultants to get the effort started, and the resources to create and dedicate a team to the effort. Is that realistic for your organization? And what goals must your design system support? Is widespread adoption important? Do you need broad coverage from day one or can you build the system over time? All of these questions will drive the approach you take. With that in mind though, here are some common themes to consider as you create your own design system.

Characteristics of successful design systems and style guides

Whether you are creating a full-blown design system or a more limited style guide, consider these important characteristics:

It takes into account audience needs
Remember that the audience for your style guide is the entire product team. Designers, developers, and QA people will all rely on the design system for their work. Include them on the team that creates the system and make sure the contents of the system reflect their needs.
Continual improvement
Design systems must be considered living documents. They must be a single source of truth for your organization. As your product evolves, so too must your design system. The design system should be malleable enough to add updates easily, and you must have a clear process for making these updates.
There is clear ownership
Assign an owner to the design system. This could be a dedicated team with a product owner, an editor or curator who works with content creators, or simply a single responsible person, but it must be clear who is responsible for keeping the design system up-to-date. If the role becomes burdensome, consider rotating it every three months.
The system is actionable
Your design system is not just a library or museum for user interface elements. It should be a “widget factory” that can produce any interface element on demand. As each new element is added to the system, make it available for download in whatever formats your team will need. Ensure that not only the code is available but the graphical and wireframe assets, as well. This allows every designer to have a full palette of interface elements with which to create prototypes at any given time.
The system is accessible
Accessibility means that the design system is available to everyone in your organization. Accessible design systems are:
Easily found
Use a memorable URL and ensure that everyone is aware of it.
Easily distributed
Ensure that your teams can access it at their convenience (in the office, out of the office, on mobile devices, etc.).
Easy to search
A comprehensive and accurate search of the design system greatly increases its usage.
Easy to use
Treat this as you would any other design project. If it’s not usable, it will go unused very quickly.
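The “actionable” characteristic above—a widget factory rather than a museum—often means treating design tokens as the single source of truth and generating platform assets from them on demand. A minimal sketch, with invented token names and values:

```typescript
// Hypothetical design tokens: the single source of truth for values
// shared by designers and developers.
const tokens: Record<string, string> = {
  "color-primary": "#0057b8",
  "color-danger": "#d0021b",
  "space-sm": "4px",
  "space-md": "8px",
};

// Emit the tokens as CSS custom properties. The same token source
// could feed other generators (Sass variables, mobile themes, etc.).
function tokensToCss(t: Record<string, string>): string {
  const lines = Object.entries(t).map(([k, v]) => `  --${k}: ${v};`);
  return `:root {\n${lines.join("\n")}\n}`;
}

console.log(tokensToCss(tokens));
```

When a brand color changes, the team updates one token and regenerates—every consumer of the system picks up the change.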
What goes into a design system?

If it’s made of pixels, it goes into the design system. All interaction design elements should be defined and added to the design system. Use design patterns that work well in your existing product as the baseline of your design system. Form fields, labels, drop-down menus, radio button placement and behavior, Ajax and jQuery events, buttons—all of these should be included in the design system, as is illustrated in Figure 1-11, which shows the system for Salesforce.

Figure 1-11. If it’s made of pixels, it goes into the design system

Provide three data points for each interaction design element (see Figure 1-12):

What the element looks like
Include detail about the minimum and maximum sizes of the element, vertical and horizontal constraints, and any styling demands on the element.
Where it’s usually placed on the screen
Make it clear if an element should be consistently placed in certain areas of the screen as well as any exceptions that might negate this design pattern.
When it should be used
It’s imperative that your team knows when to use a drop-down menu over a radio button and other factors that would determine the selection of one UI element in place of another.
Figure 1-12. A detail from the Westpac GEL design system

Next, include all visual design elements. Begin with the general color palette of your product. Ensure that each primary color is available with hex values along with complementary and secondary color choices. If certain elements, like buttons, for example, have different colors based on state, include this information in the description. Other elements to include here are logos, headers, footers, grid structures, and typographical choices (i.e., which fonts to use where and at what size/weight). The same attributes of what, where, and when provided for interaction design elements should also be provided here.
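State-dependent colors like these can be encoded as data rather than prose, so every consumer of the design system gets identical values. The sketch below uses an invented palette, and the darkening rule is a stand-in for whatever per-state hex values your brand guidelines actually specify:

```typescript
// Invented base color; a real system would take this from brand guidelines.
const primary = "#0057b8";

// Derive a darker shade of a 6-digit hex color by subtracting a fixed
// amount from each channel, clamped to the 0-255 range.
function darken(hex: string, amount: number): string {
  const n = parseInt(hex.slice(1), 16);
  const clamp = (v: number) => Math.max(0, Math.min(255, v));
  const r = clamp(((n >> 16) & 0xff) - amount);
  const g = clamp(((n >> 8) & 0xff) - amount);
  const b = clamp((n & 0xff) - amount);
  return "#" + ((r << 16) | (g << 8) | b).toString(16).padStart(6, "0");
}

// Button colors by state, as the design system would document them.
type ButtonState = "default" | "hover" | "disabled";
const buttonColors: Record<ButtonState, string> = {
  default: primary,
  hover: darken(primary, 24),
  disabled: "#c4c4c4",
};
```

Whether states are derived (as here) or spelled out hex-by-hex, the point is the same: the what, where, and when of each color lives in one place.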

Finally, you need to codify copywriting styles, as well. Capture the tone of your brand, specific words you will and won’t use, grammatical choices, tolerated (and not) colloquialisms, along with button language (“OK,” “Yes,” “Go,” etc.) and other navigation language (previous/next, more/less, and so on).

Alternatives: The Wiki-Based Style Guide

Of course, not every team will have the wherewithal to create a design system. For teams that can’t justify the effort, you can still get a lot of value out of a wiki-based style guide. Here’s why:

  • Wikis are familiar places for developers. This means that getting your teammates in engineering to participate in this tool will not involve forcing them to learn a new tool or one that was purpose-built only for designers.

  • Wikis keep revision histories (good ones do anyway). This is crucial because there will be times when you might want to roll back updates to the UI. Revision histories keep you from having to recreate previous states of the style guide.

  • Wikis keep track of who changed what and provide commenting functionality. This is ideal for keeping a trail of what decisions were made, who made them, and what the rationale and even the discussion were for making that change. As you onboard new team members, this type of historical capture can bring them up to speed much faster, as well. In other words, wikis are your documentation.

Collaborating with Geographically Distributed Teams

Physical distance is one of the biggest challenges to strong collaboration. Some of the methods we’ve discussed in this chapter—especially Design Studio—become more difficult when a team isn’t all in the same location. But you can still find ways to collaborate. Tools such as Skype, Google Hangouts, and Slack can provide teams with the means to collaborate in real time. Google Docs (including Google Draw) and purpose-built services like Mural.com allow teammates to collaborate on a document at the same time. Trello and wikis make it possible for teams to track information together. And a phone with a camera can make it easy to quickly share photos in an ad hoc way. All these tools can make cross-time-zone collaboration more effective and can help teams to feel virtually connected for long periods of time during the day.

Collaborative Design Sessions with Distributed Teams

Working on a geographically distributed team can make collaborative design more difficult. The benefits of collaboration are worth the extra effort it takes to overcome the challenge of distance. Let’s take a look at how one team we worked with overcame a continent-wide separation and designed solutions together.

This team was spread into two groups in two cities: the product and user experience team was in New York and the development team was in Vancouver. Our goal was to run a Design Studio and affinity mapping session with the entire team.

Set up

We asked the two groups to gather in their individual conference rooms with their own laptops. Each conference room had a Mac in it with a location-specific Skype account (that is, it wasn’t a specific individual’s account—it was an “office” account). The two offices connected to each other via their office Skype accounts so that we could see each other as a group. This visual element was critical because it was the closest we could get to physically being in the same room.

We prepared a very brief (roughly 10 slides) setup presentation that explained the problem statement we were tackling. It included customer testimonials and data, and a very brief recap of our customers’ needs. The presentation also included the constraints of the solution space.

Priming the pump with affinity mapping

We kicked things off with an affinity mapping exercise. Typically, these are done by using sticky notes and a whiteboard. In this case, we used a shared Google Doc spreadsheet to conduct the exercise, as shown in Figure 1-13. We asked everyone in both offices to sign in to the shared spreadsheet. The spreadsheet itself had a column labeled for each person. Google Docs allows multiple editors to work in the same document. For this meeting, we had eight team members in the document at the same time!

We asked the team to come up with as many ideas as they could think of to solve the problem we presented. Each team member wrote one idea per cell in the column marked with that individual’s name. We gave the team five minutes to generate as many ideas as they could.

Next, to make sure everyone in each location was aware of all of the proposals, we asked the team members to read their ideas to the distributed team. Some ideas went by quickly, whereas others generated more discussion.

Figure 1-13. Using Google Sheets for an affinity mapping session with a distributed team

To simulate affinity grouping in the shared spreadsheet, one member of the team, serving as a facilitator, began a second sheet in the document using a personal laptop. The facilitator created some initial column headers in the second sheet that reflected recurring themes that emerged from discussion.

Then, we asked the team to group the ideas under the themes. Everyone moved their own ideas into the theme sheet, and people were free to add new themes if they felt their ideas didn’t fit into any of the existing themes. At the end of this process, we had created a spreadsheet filled with ideas that were sorted into themes. Some themes had just a pair of ideas; others had as many as eight.

Design Studio with remote teams

To set up for the next step, a Design Studio session, we tried to mimic a colocated version of the activity as much as possible. We provided paper and pens at each location. We created a dual-monitor setup in each conference room so that each room would be able to see the sketches on one monitor while still being able to see their teammates via Skype on the second monitor, as shown in Figure 1-14. We asked each team to use a phone to photograph their sketches and email them to everyone else. This helped connect the dialog and the artifact to the conversation.

Figure 1-14. Dual monitor setup during remote Design Studio

After that initial setup, we were able to proceed with the Design Studio process as normal. Team members were able to present their ideas to both rooms and to receive trans-continental critique. The two teams were able to refine their ideas together and were eventually able to converge on one idea to take forward.

Making Collaboration Work

Not every team will find that collaboration comes easily. Most of us begin our careers by developing our individual technical skills as designers, developers, and so on. And in many organizations, collaboration across disciplines is rare. So it’s no wonder that it can feel challenging.

One of the most powerful tools for improving collaboration is the Agile technique of the retrospective and the related practice of creating Team Working Agreements. Retrospectives are regularly scheduled meetings, usually held at the end of every sprint, in which the team takes an honest look back at the past sprint. They examine what went well, what went poorly, and what the team wants to improve. Usually, the team will select a few things to work on for the next sprint. We can think of no more powerful tool for improving collaboration than the regular practice of effective retrospectives.

A Team Working Agreement is a document that serves as a partner to the retrospective. It keeps track of how the team has chosen to work together. It’s a self-created, continuously updated rule book that the team agrees to follow. At each retrospective, the team should check in with their Working Agreement to see if they’re still following it and if they need to update it to include new agreements or remove old ones that no longer make sense.

Here’s an outline for what you should consider covering in your Team Working Agreements (we’ve made a copy of our favorite template available online at http://leanuxbook.com/links):

Process overview
What kind of process are we using? Agile? If so, what flavor? How long are our iterations?
Ceremonies
What rituals will the team observe? For example, when is stand-up each day? When do we hold planning meetings and demos?
Communication/Tools
What systems will we use to communicate and document our work? What is our project management tool? Where do we keep our assets?
Working hours
Who works where? When are folks in the office? If we’re in different locations, what accommodations will we make for time-zone differences?
Requirements and design
How do we handle requirements definition, story writing, and prioritization? When is a story ready for design? When is a design ready to be broken into stories?
Development
What practices have we settled on? Do we use pair programming? What testing style will we use? What methods will we use for source control?
Work-in-progress limits
What is our backlog and icebox size? What WIP limits exist in various stages of our process?
Deployment
What is our release cadence? How do we do story acceptance?

Finally, capture any additional agreements the team wants to make.

Wrapping Up

Collaborative design (Figure 1-15) is an evolution of the UX design process. In this chapter, we discussed how opening up the design process brings the entire team deeper into the project. We talked about how the low-fidelity drawings created in Design Studio sessions can help teams generate many ideas and then converge on a set that the entire team can get behind. We showed you practical techniques you can use to create shared understanding—the fundamental currency of Lean UX. Using tools like design systems, style guides, collaborative design sessions, Design Studio, and simple conversation, your team can build a shared understanding that allows them to move forward at a much faster pace than in traditional environments.

Figure 1-15. A team using collaborative design techniques

Now that we have all of our assumptions declared and our design hypotheses created, we can begin the learning process. In the next chapter, we cover the Minimum Viable Product and how to use it to plan experiments. We use those experiments to test the validity of our assumptions and decide how to move forward with our project.

1In the years since we published the first edition of this book, the Design Studio method has become increasingly popular. There are now two comprehensive guides to the method. If you want to go deeper than our coverage, see Design Sprint by Banfield, Lombardo, and Wax and Sprint by Knapp, Zeratsky, and Kowitz.

Continue reading Collaborative design fuels shared understanding—the fundamental currency of Lean UX.

Categories: Technology

Four short links: 9 August 2018

O'Reilly Radar - Thu, 2018/08/09 - 04:20

Music Money, Faster Webpages, Sampling Neurons, and Catching Deepfakes

  1. How Musicians Make Money (Or Don’t at All) in 2018 (Rolling Stone) -- When you end up tracing all the dollars, around 10% of it gets captured by the artist. That’s amazingly low.
  2. Why AMP? -- an interesting answer from Hacker News to this question: AMP doesn't support a lot of the crap that makes webpages slow, so it's a way to say "computer says no" to feature requests that would slow page load time. If you can find a better way to convince large organizations that page load speed is a valuable metric, and more important than whatever other resource they want to load today, I'd love to hear it. But from what I've seen, AMP is the only thing that's had any success in tackling this problem.
  3. Neuropixels -- most electrode arrays are built in academic foundries and house 64 sensors in a 1,050-square-micron device. Neuropixels were designed and manufactured in a foundry called Imec, owned by the Flemish government. The Imec probe packs nearly 1,000 recording sites onto a single shank about 1,400 microns square and 10 millimeters long, which spans the full depth of a rat brain.
  4. DARPA's First Tools For Catching Deepfakes -- via the Media Forensics program. Others involved in the DARPA challenge are exploring similar tricks for automatically catching deepfakes: strange head movements, odd eye color, and so on. “We are working on exploiting these types of physiological signals that, for now at least, are difficult for deepfakes to mimic,” says Hany Farid, a leading digital forensics expert at Dartmouth College. I do hope their "what is not fake" data set includes non-neurotypical people.

Continue reading Four short links: 9 August 2018.

Categories: Technology

Four short links: 8 August 2018

O'Reilly Radar - Wed, 2018/08/08 - 04:10

AI Patenting, Data Viz Errors, Developer Tool, and Origami-hand

  1. AI Patenting Up -- Facebook filed for 55 patents related to machine learning or neural networks in 2016, up from zero in 2010. IBM, which has been granted more U.S. patents than any other company for the past 25 years running, boasts that in 2017 it won 1,400 AI-related patents, more than ever before.
  2. Data Visualization Don'ts -- so good.
  3. Luna Studio -- a developer’s whiteboard on steroids. Design, prototype, develop, and refactor any application simply by connecting visual elements together. Collaborate with co-workers, interactively fine tune parameters, inspect the results, and visually profile the performance in real time. I'm very interested in IDE and tool improvements; they're force multipliers for programmers.
  4. Origami-hand -- Origami-hand is a disposable robot hand that folds and assembles paper. We aim to expand the application range of robots by realizing hands that can perform complicated operations at low cost. (via IEEE Spectrum)

Continue reading Four short links: 8 August 2018.

Categories: Technology
