The Dressler Blog

I have opinions. Lots of opinions.


Machine Learning: Correlation and Causation

Whether you call it machine learning or artificial intelligence, revolutionary pattern recognition technology is taking the business world by storm. From healthcare to fintech, not a week goes by without some company announcing a new AI initiative. Many of these companies have spent the last fifteen years amassing huge amounts of data with little immediate idea how to parse that data. They are naturally eager to unleash powerful new machine learning technologies to ferret out patterns in the data that might lead to higher sales, new cures, or hidden advantages.

Unfortunately, AI has developed a reputation for yielding “objective” results free of human bias and error. But all data sets yield random, yet statistically significant, patterns. The larger the data set, the more spurious correlations you are likely to see. Statistical significance is easier to achieve with a larger N. This leads to the publication of supposed insights harvested through machine learning from some company’s data. Pancreatic cancer is correlated with wearing boxers. Purchasing a new car is correlated with rain on Wednesdays. It’s all just the normal, random noise that large data sets generate.

Why does this matter? It is worthwhile to return to Nassim N. Taleb’s opinion piece “Beware the Big Errors of Big Data” from 2013. Although he isn’t specifically discussing machine learning, his points still apply. There is a reason (most) academics don’t simply query large data sets looking for random correlations to publish and promote. Scientists are aware that such an approach is contrary to the scientific method. It’s just fishing. While these correlations may serve as the basis for further, focused experiments, they are rightly considered just statistical noise until they are confirmed independently.

So how might a business with big data and a neural net avoid reacting to statistical noise?
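
To make the noise problem concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data is made-up random noise, the helper names (`make_noise`, `flagged_pairs`, `split_half_check`) are hypothetical, and "significant" is approximated with a rough large-sample cutoff of 1.96 / sqrt(n) rather than a proper test. It flags correlated column pairs on one random half of the rows, then asks whether the other half agrees.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def make_noise(n_cols, n_rows, seed=0):
    """Columns of independent Gaussian noise: no real relationships exist."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_cols)]


def flagged_pairs(columns, rows):
    """Column pairs whose |r| clears a rough 5% cutoff on the given rows."""
    cutoff = 1.96 / math.sqrt(len(rows))  # large-sample approximation
    hits = {}
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            r = pearson([columns[i][k] for k in rows],
                        [columns[j][k] for k in rows])
            if abs(r) > cutoff:
                hits[(i, j)] = r
    return hits


def split_half_check(columns, seed=0):
    """Flag pairs on one random half, keep those the other half confirms."""
    n_rows = len(columns[0])
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)
    explore, confirm = order[: n_rows // 2], order[n_rows // 2:]
    found = flagged_pairs(columns, explore)
    again = flagged_pairs(columns, confirm)
    # Confirmed means flagged in both halves with the same sign.
    confirmed = {p: r for p, r in found.items()
                 if p in again and r * again[p] > 0}
    return found, confirmed


if __name__ == "__main__":
    noise = make_noise(n_cols=40, n_rows=1000)
    found, confirmed = split_half_check(noise)
    print(f"pairs flagged on first half: {len(found)}")
    print(f"still flagged on second half: {len(confirmed)}")
```

With 40 columns there are 780 pairs to test, so a 5% cutoff should flag a few dozen of them by chance alone, and most of those should fail to repeat on the held-out rows. The exact counts depend on the seed.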
One simple way is to split your big data set randomly in half. Run your machine learning on one half of the data set and then see if the resulting correlations are confirmed by the other half. It’s not perfect (there may be bias in your collection methods), but it’s better than jumping at random correlations.

In a nutshell: Correlation without causation is still a danger in an AI world.

What is a Quantum Computing Simulator?

Microsoft wants in on the quantum computing revolution. Quantum computing is the predicted “next big thing” that will come after the current “next big thing,” which is machine learning. Or perhaps it’s augmented reality. Next Big Things are notoriously slippery. Unfortunately for Microsoft, they don’t actually have a quantum computer. Not to worry, though. The original masters of vaporware have released a new quantum computing language that coincidentally works with their own Visual Studio, along with a handy quantum simulator on which to test your code.

But what the heck is this “quantum simulator” thing? First, we need to understand that a real quantum computer has qubits, which are individual computing units maintained in a quantum state by extraordinary methods like extreme cooling. Through quantum superposition and entanglement, these qubits are able to perform calculations far beyond the power of traditional binary computer bits. Because of the multiple states suggested by quantum superposition, adding qubits increases computing power exponentially, while adding bits to a classical computer merely increases its computing power linearly. A quantum computing simulator uses standard computing chips to simulate the imagined behavior of a quantum computer.

So why bother to build a real quantum computer if a simulator can do the same thing? First, a simulator is merely an approximation of quantum behavior. It’s just a good guess for how a quantum computer might act.
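
To put rough numbers on that exponential growth, here is a back-of-the-envelope sketch. It assumes a simulator that stores the full quantum state as 2^n complex amplitudes at 16 bytes each (two 64-bit floats). That is a common way to build a simulator, but it is an assumption here, not a description of how Microsoft’s simulator works, and the qubit counts are only illustrative.

```python
BYTES_PER_AMPLITUDE = 16  # assumption: one complex number as two 64-bit floats


def state_vector_bytes(n_qubits):
    """Memory needed to hold all 2**n amplitudes of an n-qubit state."""
    return (2 ** n_qubits) * BYTES_PER_AMPLITUDE


def human(n_bytes):
    """Format a byte count using binary units (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = float(n_bytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:g} {unit}"
        value /= 1024


if __name__ == "__main__":
    for n in (30, 42, 49):
        print(f"{n} qubits: {human(state_vector_bytes(n))}")
    # prints:
    # 30 qubits: 16 GiB
    # 42 qubits: 64 TiB
    # 49 qubits: 8 PiB
```

Every extra qubit doubles the memory bill, which is why a laptop runs out of room somewhere around 30 simulated qubits under these assumptions.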
Second, exponential increases in the power of a true quantum computer mean that standard computers can only approximate it up to a point. Add too many qubits to your model and no supercomputer in the world can keep up. (Computer scientists estimate that 49 qubits is the tipping point. The Microsoft simulator claims to scale to 42 virtual qubits.)

Why does this matter? Simulators are useful. Quantum computing will be a fundamental change. It pays to begin to practice the new computing methods that quantum systems will demand. However, quantum computers are pretty rare. Most of us aren’t going to get time on an actual quantum computer. So a simulator is a pretty good second option. But we shouldn’t confuse simulation with reality. Superposition and entanglement are tricky even for theoretical physicists. Simulations of actions and interactions are going to be blunt instruments compared to the real thing.

In a nutshell: The more theoretical qubits in your simulator, the less likely it will mirror actual quantum behavior.

Disrupting Disruption

Every week yields hundreds of breathless press releases. Industry after industry is about to be disrupted by an exciting new company through the magic of AI, blockchain, augmented reality, quantum computing, the sharing economy, et cetera, et cetera, et cetera. TechCrunch dutifully reports each company’s dubious claims with world-weary directness. It’s all kinda dumb. Karen Wickre, writing in Wired this week, takes on purveyors of “the D word.” Her point: disruption is extremely unlikely. Most legacy industries chug along with something close to efficiency. That is, after all, the promise of capitalism. So, with the exhausted patience of a middle school English teacher, she urges us to stop using hyperbolic language to describe modest innovation.

Why does this matter? Words have meanings. When an industry is disrupted, it is rendered unrecognizable to itself. If I open up a juice bar, no disruption is imminent.
If I open up a juice bar with a custom-built flavor-optimization AI, I’m still not disrupting anything. It’s still just juice. Hyperbole is the quick-hit heroin of public relations. Readers engage when they hear that their industry is about to be disrupted. But hyperbole loses its effectiveness with time, forcing startup founders and their marketing enablers to reach for higher and higher doses. Eventually, you have a vending machine that’s going to change the world.

In a nutshell: Disruption is only evident after the fact.
