Blurring sensitive information no longer keeps it safe on the Internet

We have long relied on simple image manipulations like blurring and pixelation to obscure sensitive information on the internet, but that may not work for much longer. Researchers from the University of Texas at Austin and Cornell Tech have developed a machine learning system that can identify faces and text in obfuscated images with alarming accuracy. And it wasn't even very hard to do.

The premise of blurring and pixelation comes from a very human perspective: if we can't see through the filter, it's considered good enough. Humans are no longer the pattern recognition champions, though. The researchers trained a neural network on images of faces and text and found it could often correctly identify those images again after they had been obfuscated with three different techniques: YouTube's proprietary blur tool, standard mosaic pixelation, and the P3 algorithm. The system doesn't technically de-blur an image so a human can see it; instead, it identifies the content of the obfuscated image by matching it to the original.

The most striking aspect of this approach is simply how easy it was to implement. The researchers used Torch, an open-source machine learning framework, and the training data of faces and text was also publicly available. The team fed the images into the neural network until it could recognize them with greater than 90% accuracy, then obfuscated the images with the three methods listed above and tested it again. A minimal sketch of that attack pipeline appears below.

For some data sets, the neural network correctly identified YouTube-blurred images with 80-90% accuracy. With mosaic processing, it identified even the most aggressively pixelated images with 50-75% accuracy. The lowest accuracy came with P3 (Privacy-Preserving Photo Sharing), which encrypts the identifying data in a JPEG but leaves the other data components intact; in that case, the network was right only 17% of the time, which is still higher than random chance.

You don't yet need to worry that every pixelated photo you've posted online is being decoded, but that day might not be far off. This was an extremely simple neural network built from off-the-shelf software; someone with more time and resources could likely do much better. Going forward, you might want to simply remove all sensitive information from images with a solid black bar. That'll stop the machines. For now, at least.
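To make the idea concrete, here is a minimal sketch of the kind of attack the article describes: train an ordinary image classifier on clean face photos, then check how often it still names the right identity when shown pixelated versions. This is not the researchers' code (they used the original Lua-based Torch); it uses PyTorch, and the dataset paths, model choice, pixelation block size, and folder layout are all hypothetical.

```python
# Illustrative sketch, not the published attack: clean-image training followed
# by evaluation on mosaic-pixelated copies. Paths and hyperparameters are assumptions.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

def pixelate(img, block=8):
    """Mosaic pixelation: downsample to coarse blocks, then upsample back."""
    _, h, w = img.shape
    small = F.interpolate(img.unsqueeze(0), size=(h // block, w // block), mode="nearest")
    return F.interpolate(small, size=(h, w), mode="nearest").squeeze(0)

clean_tf = transforms.Compose([transforms.Resize((64, 64)), transforms.ToTensor()])
pixel_tf = transforms.Compose([clean_tf, transforms.Lambda(lambda x: pixelate(x, block=8))])

# Folder-per-identity layout (hypothetical paths), as torchvision's ImageFolder expects.
train_set = datasets.ImageFolder("faces/train", transform=clean_tf)
test_set = datasets.ImageFolder("faces/test", transform=pixel_tf)

model = models.resnet18(num_classes=len(train_set.classes))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Train on clean images only; a handful of epochs here purely for illustration.
model.train()
for epoch in range(10):
    for x, y in DataLoader(train_set, batch_size=64, shuffle=True):
        opt.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        opt.step()

# Evaluate on pixelated images: the fraction of correct top-1 guesses is the
# re-identification rate the article talks about.
model.eval()
correct = total = 0
with torch.no_grad():
    for x, y in DataLoader(test_set, batch_size=64):
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
print(f"re-identification accuracy on pixelated faces: {correct / total:.1%}")
```

The same evaluation loop could be pointed at blurred or P3-processed copies instead of the pixelation transform; the key point is that the attacker never needs to reconstruct the hidden pixels, only to match the degraded image against identities seen during training.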
