Safety Sold Separately

Steven D Marlow
5 min read · Mar 6, 2021


There are a lot of people thinking about AI Safety, which is fine, but just like the AI Ethics movement, there seems to be a disconnect between “make it safe” and what that actually means in terms of real-world design. Before taking a stroll down that path, it’s important to take a step back and address the issue that kicked off the recent debate.


So now that we have that out of the way… OK, it was their new prize pig called SEER (Seer?) that had access to a billion Instagram posts and did the usual Deep Learning thing of finding patterns in each image and then grouping like-for-like into folders. The new trick is that, instead of humans having to label millions of images ahead of time, they only had to label some of those folders after the “sorting” was done. Does label data actually play a role in how the algorithm works? Not really. Labeling after rather than before just allows for “self-supervised” training at a much larger scale. It seems closer to one-shot learning than reinforcement learning*, but I honestly don’t care because it’s not “real” AI research.
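To make the "sort first, label the folders later" idea concrete, here is a minimal sketch. It uses plain k-means on made-up 2-D points, which is a stand-in I picked for illustration and not how SEER actually works. The points, the starting centroids, and the folder names are all hypothetical. The thing to notice is that the sorting function never sees a label.

```python
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )

def cluster(points, centroids, rounds=10):
    """Plain k-means. No labels are used anywhere in this function."""
    for _ in range(rounds):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            tuple(sum(axis) / len(groups[i]) for axis in zip(*groups[i]))
            if groups[i] else centroids[i]
            for i in range(len(centroids))
        ]
    return centroids

# Hypothetical "image features": two obvious clumps.
points = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids = cluster(points, centroids=[points[0], points[3]])

# The human step happens afterwards: one name per folder, not one per image.
folder_names = {0: "cats", 1: "dogs"}  # hypothetical labels

def label_of(point):
    """Look up the human-assigned name of the folder a point falls into."""
    return folder_names[nearest(point, centroids)]
```

Swapping the names in `folder_names` changes nothing about how the points were grouped, which is the sense in which the labels play no role in the sorting itself.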

The question became: is it intelligence, or is it just parroting? The fact that label data still needed to be added by humans would suggest no progress was made, and that these systems can’t pull context or meaning out of thin air. Humans provide a level of “labeled data” for their children, but supplying labels is not the same as having the cognitive capacity to correlate those labels with objects, nor is labeling a prime function of how babies and toddlers build models of the world (for the first few years of mental development, humans are language agnostic).

And then even talking about AIs being “intelligent” was tossed aside to bring safety, performance, and robustness into the debate. If we stick with the idea that SEER is just a glorified sorting algorithm, performance is just a matter of compute hardware, and robustness is moot (though I’m sure there will be some language variant in the news before the end of the year). All of this leads to the safety issues for this kind of system. Or any system.

An electrical breaker has a physical form and a single function. To make it “safer” you need a safety feature that also has a physical form, and then a method or design that integrates the two. Machine Learning can be an electrical breaker, or some other device, that operates in a particular way from a particular structure. Yes, the analogy isn’t perfect. ML/DL is just code and data, but the thing you want this “program” to do requires some kind of form in order to operate.

Sorting images from Instagram makes that data set the form while the method leading to an output is your feature. You don’t make it “safer” by looking at more images, or sorting into more categories. If by safe you mean, don’t save images of children, then you need that function to be constructed and integrated into the current sorting design. This doesn’t work so well because your ML filter isn’t good enough to work internally as a safety measure. It is not external to the problem (which is what adding a “child” label to the filtered image folder would require, despite having to first do the thing you are trying NOT to do for safety reasons).

“Safe” requires context, and an interface. Any idea that “safe” can be built into all AI projects to prevent unforeseen problems just means someone hasn’t thought about implementation, or even tried it on a few examples. “Safe” could mean not even using SEER or some other AI system in the first place. Being safe, like being ethical, happens in hindsight. For every warning label about how hot the coffee is, or that toasters don’t like to take baths, there was a patient zero. A Darwin Award winner. Everything that seems so obvious now was not always so.

Then there is this gem. Observation is not participation. It really goes back to the question of intelligence, or at least whether current systems have the ability to participate in their actions. I suggest that, no, they merely observe. Finding some interesting pattern in a group of related images only implies an understanding of what the pattern represents (and we know that cloudy or not-cloudy can play a hidden role in what is observed). Safety features are post-observation events, and ML depends on humans for all of that.

Using the ‘don’t save images of children’ as an example, the algorithm has no way to participate. It can only observe a known pattern. That it is a child, and that it is not to be stored, are external activities. And again, just to train the system on “child” is going to require saving images first. Can you make the safety feature work at the observation level? If we ignore the training part, perhaps it’s enough to replace the ‘child’ folder with null or delete action. The algorithm wouldn’t know saved from not saved; it’s just focused on sorting.
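As a rough sketch of that "replace the folder with a null or delete action" idea, assume some sorter has already handed back `(image_id, folder)` pairs. The `POLICY` table, the `store` and `discard` actions, and the folder names below are all hypothetical names made up for illustration. The policy sits outside the sorter, and the sorter has no idea which outputs were kept.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Action = Callable[[str, List[str]], None]

def store(image_id: str, saved: List[str]) -> None:
    """Default action: keep the image."""
    saved.append(image_id)

def discard(image_id: str, saved: List[str]) -> None:
    """Null action: the image is simply never written anywhere."""
    return None

# Hypothetical policy table. Folders not listed here fall through to `store`.
POLICY: Dict[str, Action] = {"child": discard}

def route(sorted_images: Iterable[Tuple[str, str]], saved: List[str]) -> List[str]:
    """Apply the policy to the sorter's output, one (image_id, folder) pair at a time."""
    for image_id, folder in sorted_images:
        action = POLICY.get(folder, store)
        action(image_id, saved)
    return saved

kept = route([("a", "beach"), ("b", "child"), ("c", "dog")], [])
```

Note what this does not solve: something still had to decide that image `b` belongs in the `child` folder, and that decision is exactly the observation the safety rule was supposed to avoid.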

Safety is just a layer built on top of the tools we use. AI “safety” is really just a question of which policy you follow with regard to the output. I’m not sure how one would contort themselves to automate the pre-screening of data. Train a system on child images and use it to filter child images from the “actual” system so it never uses those images as training data? Perhaps self-supervised will be replaced in favor of “preemptive.” That does seem to be the intended use case, where the observation, participation, and “safety action” happen automatically. Some future ‘IMP’ AI that has, as one of its policies, to not live stream active shootings. To understand what it is seeing, and in the context of directives, requires more than billions of images or thousands of hours of training.

Such a system needs to be artificially intelligent.

*Just to expand on the one-shot vs reinforcement idea: SEER seems to one-shot the billion images then does the “reinforcement” work by looking for closely matched weight distributions in the network. Sort of an additive method in post. The traditional training label data method would sort images into the groups where all combined would result in a single distribution of weights.


