The tax collection agency will transition away from a service provided by the authentication company ID.me amid bipartisan backlash.
The geopolitical consequences may be radical.
Saying it wants “to find the right balance” with the technology, the social network will delete the face scan data of more than one billion users.
To protect children online, more companies and governments are forcing users to prove how old they are.
Berlin-based Mobius Labs has closed a €5.2 million (~$6.1M) funding round off the back of increased demand for its computer vision training platform. The Series A investment is led by Ventech VC, along with Atlantic Labs, APEX Ventures, Space Capital, Lunar Ventures plus some additional angel investors.
The startup offers an SDK that lets the user create custom computer vision models fed with a little of their own training data — as an alternative to off-the-shelf tools which may not have the required specificity for a particular use-case.
It also flags a ‘no code’ focus, saying its tech has been designed with a non-technical user in mind.
As it’s an SDK, Mobius Labs’ platform can also be deployed on premise and/or on device — rather than the customer needing to connect to a cloud service to tap into the AI tool’s utility.
“Our custom training user interface is very simple to work with, and requires no prior technical knowledge on any level,” claims Appu Shaji, CEO and chief scientist.
“Over the years, a trend we have observed is that often the people who get the maximum value from AI are non technical personas like a content manager in a press and creative agency, or an application manager in the space sector. Our no-code AI allows anyone to build their own applications, thus enabling these users to get close to their vision without having to wait for AI experts or developer teams to help them.”
Mobius Labs — which was founded back in 2018 — now has 30 customers using its tools for a range of use cases.
Uses include categorisation, recommendation, prediction, reducing operational expense, and/or “generally connecting users and audiences to visual content that is most relevant to their needs”. (Press and broadcasting and the stock photography sector have unsurprisingly been big focuses to date.)
But it reckons there’s wider utility for its tech and is gearing up for growth.
It caters to businesses of various sizes, from startups to SMEs, but says it mainly targets global enterprises with major content challenges — hence its historical focus on the media sector and video use cases.
Now, though, it’s also targeting geospatial and earth observation applications as it seeks to expand its customer base.
The 30-strong startup has more than doubled in size over the last 18 months. With the new funding it’s planning to double its headcount again over the next 12 months as it looks to expand its geographical footprint — focusing on Europe and the US.
Year-on-year growth has also been 2x but it believes it can dial that up by tapping into other sectors.
“We are working with industries that are rich in visual data,” says Shaji. “The geospatial sector is something that we are focussing on currently as we have a strong belief that vast amounts of visual data is being produced by them. However, these huge archives of raw pixel data are useless on their own.
“For instance, if we want to track how river fronts are expanding, we have to look at data collected by satellites, sort and tag them in order to analyse them. Currently this is being done manually. The technology we are creating comes in a lightweight SDK, and can be deployed directly into these satellites so that the raw data can be detected and then analysed by machine learning algorithms. We are currently working with satellite companies in this sector.”
“We realise these are the big players but at the same time believe that we have something unique to offer, which these players cannot: Unlike their solutions, our platform users can be outside the field of computer vision. By democratising the training of machine learning models beyond simply the technical crowd, we are making computer vision accessible and understandable by anyone, regardless of their job titles,” he argues.
“Another core value that differentiates us is the way we treat client data. Our solutions are delivered in the form of a Software Development Kit (SDK), which runs on-premise, completely locally on clients’ systems. No data is ever sent back to us. Our role is to empower people to build applications, and make them their own.”
Computer vision startups have been a hot acquisition target in recent years, and some earlier startups offering ‘computer vision as a service’ were acquired by IT services firms to beef up their existing offerings, while tech giants like Amazon and Google offer their own computer vision services too.
But Shaji suggests the tech is now at a different stage of development — and primed for “mass adoption”.
“We’re talking about providing solutions that empower clients to build their own applications,” he says, summing up the competitive play. “And that [do that] with complete data privacy, where our solutions run on-premise, and we don’t see our clients data. Coupled with that is the ease of use that our technology offers: It is a lightweight solution that can be deployed on many ‘edge’ devices like smartphones, laptops, and even on satellites.”
Commenting on the funding in a statement, Stephan Wirries, partner at Ventech VC, added: “Appu and the team at Mobius Labs have developed an unparalleled offering in the computer vision space. Superhuman Vision is impressively innovative with its high degree of accuracy despite very limited required training to recognise new objects at excellent computational efficiency. We believe industries will be transformed through AI, and Mobius Labs is the European Deep Tech innovator teaching machines to see.”
Facebook called it “an unacceptable error.” The company has struggled with other issues related to race.
It hasn’t even been a week since Tesla hosted its AI Day, a live-streamed event full of technical jargon meant to snare the choicest of AI and vision engineers to come work for Tesla and help the company achieve autonomous greatness, and already CEO Elon Musk is coming in with some hot takes about the “Full Self-Driving” (FSD) tech.
In a tweet on Tuesday, Musk said: “FSD Beta 9.2 is actually not great imo, but Autopilot/AI team is rallying to improve as fast as possible. We’re trying to have a single tech stack for both highway & city streets, but it requires massive [neural network] retraining.”
This is an important point. Many others in the autonomous space have mirrored this sentiment. Don Burnette, co-founder and CEO of Kodiak Robotics, says his company is exclusively focused on trucking for the moment because it’s a much easier problem to solve. In a recent ExtraCrunch interview, Burnette said:
“One of the unique aspects of our tech is that it’s highly customized for a specific goal. We don’t have this constant requirement that we maintain really high truck highway performance while at the same time really high dense urban passenger car performance, all within the same stack and system. Theoretically it’s certainly possible to create a generic solution for all driving in all conditions under all form factors, but it’s certainly a much harder problem.”
Because Tesla is only using optical cameras, scorning lidar and radar, calling the required neural network retraining “massive” is no overstatement.
Despite the sympathy we all feel for the AI and vision team that is undoubtedly feeling a bit butthurt by Musk’s tweet, this is a singular moment of clarity and honesty for Musk. Usually, we have to filter Tesla news about its autonomy with a fine-tuned BS meter, one that beeps wildly with every mention of its “Full Self-Driving” technology. Which, for the record, is not at all full self-driving; it’s just advanced driver assistance that could, we grant, lay the groundwork for better autonomy in the future.
Musk followed up the tweet by saying that he just drove the FSD Beta 9.3 from Pasadena to LAX, a ride that was “much improved!” Do we buy it? Musk is ever the optimist. At the start of the month, Musk said Tesla would be releasing new versions of its FSD every two weeks at midnight California time. Then he promised that Beta 9.2 would be “tight,” saying that radar was holding the company back and now that it’s fully accepted pure vision, progress will go much faster.
Perhaps Musk is just trying to deflect the flurry of bad press about the FSD system. Last week, U.S. auto regulators opened a preliminary investigation into Tesla’s Autopilot, citing 11 incidents in which vehicles crashed into parked first responder vehicles. Why first responder vehicles in particular, we don’t know. But according to investigation documents posted on the National Highway Traffic Safety Administration’s website, most of the incidents took place after dark. Poor night vision is definitely a thing with many human drivers, but those kinds of incidents just won’t fly in the world of autonomous driving.
Elon Musk wants Tesla to be seen as “much more than an electric car company.” On Thursday’s Tesla AI Day, the CEO described Tesla as a company with “deep AI activity in hardware on the inference level and on the training level” that can be used down the line for applications beyond self-driving cars, including a humanoid robot that Tesla is apparently building.
Tesla AI Day, which started after a rousing 45 minutes of industrial music pulled straight from “The Matrix” soundtrack, featured a series of Tesla engineers explaining various Tesla tech with the clear goal of recruiting the best and brightest to join Tesla’s vision and AI team and help the company go to autonomy and beyond.
“There’s a tremendous amount of work to make it work and that’s why we need talented people to join and solve the problem,” said Musk.
Like both “Battery Day” and “Autonomy Day,” the event on Thursday was streamed live on Tesla’s YouTube channel. There was a lot of super technical jargon, but here are the top four highlights of the day.
Tesla Bot: A definitely real humanoid robot
This bit of news was the last update to come out of AI Day before audience questions began, but it’s certainly the most interesting. After the Tesla engineers and executives talked about computer vision, the Dojo supercomputer and the Tesla chip (all of which we’ll get to in a moment), there was a brief interlude where what appeared to be an alien go-go dancer appeared on the stage, dressed in a white body suit with a shiny black mask as a face. Turns out, this wasn’t just a Tesla stunt, but rather an intro to the Tesla Bot, a humanoid robot that Tesla is actually building.
When Tesla talked about using its advanced technology in applications outside of cars, we didn’t think it was talking about robot slaves. That’s not an exaggeration. CEO Elon Musk envisions a world in which human drudgery like grocery shopping, “the work that people least like to do,” can be taken over by humanoid robots like the Tesla Bot. The bot is 5’8″, 125 pounds, can deadlift 150 pounds, walk at 5 miles per hour and has a screen for a head that displays important information.
“It’s intended to be friendly, of course, and navigate a world built for humans,” said Musk. “We’re setting it such that at a mechanical and physical level, you can run away from it and most likely overpower it.”
Because everyone is definitely afraid of getting beat up by a robot that’s truly had enough, right?
The bot, a prototype of which is expected for next year, is being proposed as a non-automotive robotic use case for the company’s work on neural networks and its Dojo advanced supercomputer. Musk did not share whether the Tesla Bot would be able to dance.
Unveiling of the chip to train Dojo
Tesla director Ganesh Venkataramanan unveiled Tesla’s computer chip, designed and built entirely in-house, that the company is using to run its supercomputer, Dojo. Much of Tesla’s AI architecture is dependent on Dojo, the neural network training computer that Musk says will be able to process vast amounts of camera imaging data four times faster than other computing systems. The idea is that the Dojo-trained AI software will be pushed out to Tesla customers via over-the-air updates.
The chip that Tesla revealed on Thursday is called “D1,” and it is built on 7 nm technology. Venkataramanan proudly held up the chip that he said has GPU-level compute with CPU connectivity and twice the I/O bandwidth of “the state of the art networking switch chips that are out there today and are supposed to be the gold standards.” He walked through the technicalities of the chip, explaining that Tesla wanted to own as much of its tech stack as possible to avoid any bottlenecks. Tesla introduced a next-gen computer chip last year, produced by Samsung, but it has not quite been able to escape the global chip shortage that has rocked the auto industry for months. To survive the shortage, Musk said during an earnings call this summer that the company had been forced to rewrite some vehicle software after having to substitute in alternate chips.
Aside from limited availability, the overall goal of taking the chip production in-house is to increase bandwidth and decrease latencies for better AI performance.
“We can do compute and data transfers simultaneously, and our custom ISA, which is the instruction set architecture, is fully optimized for machine learning workloads,” said Venkataramanan at AI Day. “This is a pure machine learning machine.”
Venkataramanan also revealed a “training tile” that integrates multiple chips to get higher bandwidth and an incredible computing power of 9 petaflops per tile and 36 terabytes per second of bandwidth. Together, the training tiles compose the Dojo supercomputer.
To Full Self-Driving and beyond
Many of the speakers at the AI Day event noted that Dojo will not just be a tech for Tesla’s “Full Self-Driving” (FSD) system, its definitely impressive advanced driver assistance system that’s also definitely not yet fully self-driving or autonomous. The powerful supercomputer is built with multiple aspects, such as the simulation architecture, that the company hopes to expand to be universal and even open up to other automakers and tech companies.
“This is not intended to be just limited to Tesla cars,” said Musk. “Those of you who’ve seen the full self-driving beta can appreciate the rate at which the Tesla neural net is learning to drive. And this is a particular application of AI, but I think there’s more applications down the road that will make sense.”
Musk said Dojo is expected to be operational next year, at which point we can expect talk about how this tech can be applied to many other use cases.
Solving computer vision problems
During AI Day, Tesla backed its vision-based approach to autonomy yet again, an approach that uses neural networks to ideally allow the car to function anywhere on earth via its “Autopilot” system. Tesla’s head of AI, Andrej Karpathy, described Tesla’s architecture as “building an animal from the ground up” that moves around, senses its environment and acts intelligently and autonomously based on what it sees.
“So we are building of course all of the mechanical components of the body, the nervous system, which has all the electrical components, and for our purposes, the brain of the autopilot, and specifically for this section the synthetic visual cortex,” he said.
Karpathy illustrated how Tesla’s neural networks have developed over time, and how now, the visual cortex of the car, which is essentially the first part of the car’s “brain” that processes visual information, is designed in tandem with the broader neural network architecture so that information flows into the system more intelligently.
The two main problems that Tesla is working on solving with its computer vision architecture are temporary occlusions (like cars at a busy intersection blocking Autopilot’s view of the road beyond) and signs or markings that appear earlier in the road (like if a sign 100 meters back says the lanes will merge, the computer once upon a time had trouble remembering that by the time it made it to the merge lanes).
To solve for this, Tesla engineers fell back on a spatial recurrent neural network video module, wherein different aspects of the module keep track of different aspects of the road and form a space-based and a time-based queue, both of which create a cache of data that the model can refer back to when trying to make predictions about the road.
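The queue idea described above can be illustrated with a small sketch. This is not Tesla's implementation: the class names, the push intervals and the cache size are all hypothetical, chosen only to show why keeping two caches helps. A time-triggered queue keeps filling while the car waits behind an occlusion, and a distance-triggered queue keeps what was seen a fixed number of meters back no matter how long the car sat still.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, Optional


@dataclass
class Snapshot:
    time_s: float       # capture time in seconds
    odometer_m: float   # distance travelled at capture time, in meters
    features: Any       # stand-in for the network's per-frame features


class SpaceTimeCache:
    """Keeps two bounded queues: one pushed by elapsed time, one by distance.

    The interval values and queue length are illustrative assumptions,
    not figures from Tesla.
    """

    def __init__(self, time_step_s: float = 0.5,
                 space_step_m: float = 1.0, maxlen: int = 128) -> None:
        self.time_step_s = time_step_s
        self.space_step_m = space_step_m
        self.time_queue: Deque[Snapshot] = deque(maxlen=maxlen)
        self.space_queue: Deque[Snapshot] = deque(maxlen=maxlen)

    @staticmethod
    def _last(queue: Deque[Snapshot]) -> Optional[Snapshot]:
        return queue[-1] if queue else None

    def observe(self, snap: Snapshot) -> None:
        # Time trigger: push if enough time has passed since the last entry.
        last_t = self._last(self.time_queue)
        if last_t is None or snap.time_s - last_t.time_s >= self.time_step_s:
            self.time_queue.append(snap)
        # Space trigger: push if the car has moved far enough.
        last_s = self._last(self.space_queue)
        if last_s is None or snap.odometer_m - last_s.odometer_m >= self.space_step_m:
            self.space_queue.append(snap)


cache = SpaceTimeCache()
# Car stopped at an intersection: time advances, the odometer does not.
for i in range(12):
    cache.observe(Snapshot(time_s=i * 0.25, odometer_m=0.0, features=None))
# Car drives off at 0.5 m per sample.
for i in range(12, 22):
    cache.observe(Snapshot(time_s=i * 0.25, odometer_m=(i - 11) * 0.5, features=None))
```

While the car is stopped, only the time queue grows; once it moves, the space queue records one entry per meter. A model reading both caches can recall a sign passed some distance back even after a long wait at a light.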
The company flexed its over 1,000-person manual data labeling team and walked the audience through how Tesla auto-labels certain clips, many of which are pulled from Tesla’s fleet on the road, in order to be able to label at scale. With all of this real-world info, the AI team then uses incredible simulation, creating “a video game with Autopilot as the player.” The simulations help particularly with data that’s difficult to source or label, or if it’s in a closed loop.
Background on Tesla’s FSD
At around minute forty in the waiting room, the dubstep music was joined by a video loop showing Tesla’s FSD system with the hand of a seemingly alert driver just grazing the steering wheel, no doubt a legal requirement for the video after investigations into Tesla’s claims about the capabilities of its definitely not autonomous advanced driver assistance system, Autopilot. The National Highway Traffic Safety Administration earlier this week said it would open a preliminary investigation into Autopilot following 11 incidents in which a Tesla crashed into parked emergency vehicles.
A few days later, two U.S. Democratic senators called on the Federal Trade Commission to investigate Tesla’s marketing and communication claims around Autopilot and the “Full Self-Driving” capabilities.
Tesla released the beta 9 version of Full Self-Driving to much fanfare in July, rolling out the full suite of features to a few thousand drivers. But if Tesla wants to keep this feature in its cars, it’ll need to get its tech up to a higher standard. That’s where Tesla AI Day comes in.
“We basically want to encourage anyone who is interested in solving real-world AI problems at either the hardware or the software level to join Tesla, or consider joining Tesla,” said Musk.
And with technical nuggets as in-depth as the ones featured on Thursday plus a bumping electronic soundtrack, what red-blooded AI engineer wouldn’t be frothing at the mouth to join the Tesla crew?
VOCHI, a Belarus-based startup behind a clever computer vision-based video editing app used by online creators, has raised an additional $2.4 million in a “late-seed” round that follows the company’s initial $1.5 million round led by Ukraine-based Genesis Investments last year. The new funds follow a period of significant growth for the mobile tool, which is now used by over 500,000 people per month and has achieved a $4 million-plus annual run rate in a year’s time.
Investors in the most recent round include TA Ventures, Angelsdeck, A.Partners, Startup Wise Guys, Kolos VC, and angels from other Belarus-based companies like Verv and Bolt. Along with the fundraise, VOCHI is elevating the company’s first employee, Anna Bulgakova, who began as head of marketing, to the position of co-founder and Chief Product Officer.
According to VOCHI co-founder and CEO Ilya Lesun, the company’s idea was to provide an easy way for people to create professional edits, producing unique and trendy content for social media that could help them stand out and become more popular. To do so, VOCHI leverages a proprietary computer-vision-based video segmentation algorithm that applies various effects to specific moving objects in a video or to images in static photos.
“To get this result, there are two trained [convolutional neural networks] to perform semi-supervised Video Object Segmentation and Instance Segmentation,” explains Lesun, of VOCHI’s technology. “Our team also developed a custom rendering engine for video effects that enables instant application in 4K on mobile devices. And it works perfectly without quality loss,” he adds. It works pretty fast, too — effects are applied in just seconds.
The company used the initial seed funding to invest in marketing and product development, growing its catalog to over 80 unique effects and more than 30 filters.
Today, the app offers a number of tools that let you give a video a particular aesthetic (like a dreamy vibe, artistic feel, or 8-bit look, for example). It can also highlight the moving content with glowing lines, add blurs or motion, apply different filters, insert 3D objects into the video, add glitter or sparkles, and much more.
In addition to editing their content directly, users can swipe through a vertical home feed in the app where they can view the video edits others have applied to their own content for inspiration. When they see something they like, they can then tap a button to use the same effect on their own video. The finished results can then be shared out to other platforms, like Instagram, Snapchat and TikTok.
Though based in Belarus, most of VOCHI’s users are young adults from the U.S. Others hail from Russia, Saudi Arabia, Brazil and parts of Europe, Lesun says.
Unlike some of its video editor rivals, VOCHI offers a robust free experience where around 60% of the effects and filters are available without having to pay, along with other basic editing tools and content. More advanced features, like effect settings, unique presets and various special effects, require a subscription. This subscription, however, isn’t cheap: it’s either $7.99 per week or $39.99 for 12 weeks. That seemingly aims the subscription at professional content creators rather than a casual user just looking to have fun with their videos from time to time. (A one-time purchase of $150 is also available, if you prefer.)
To date, around 20,000 of VOCHI’s 500,000 monthly active users have committed to a paid subscription, and that number is growing at a rate of 20% month-over-month, the company says.
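To put those subscriber figures in perspective, here is a back-of-the-envelope sketch. The function names are made up for illustration, and the projection assumes the 20% month-over-month rate holds steady, which the company has not claimed.

```python
import math


def conversion_rate(paying: int, monthly_active: int) -> float:
    """Share of monthly active users who pay."""
    return paying / monthly_active


def project_subscribers(current: int, monthly_growth: float, months: int) -> float:
    """Compound the current subscriber count forward at a constant monthly rate."""
    return current * (1 + monthly_growth) ** months


def months_to_double(monthly_growth: float) -> float:
    """Months needed to double at a constant compound monthly rate."""
    return math.log(2) / math.log(1 + monthly_growth)


rate = conversion_rate(20_000, 500_000)               # reported figures
in_a_year = project_subscribers(20_000, 0.20, 12)     # assumes constant 20% growth
doubling = months_to_double(0.20)
print(f"{rate:.0%} of monthly users pay")
print(f"about {in_a_year:,.0f} subscribers after 12 months at a constant 20%")
print(f"doubling time: about {doubling:.1f} months")
```

On the reported numbers, roughly 4% of monthly users pay. At a sustained 20% monthly rate the paying base would double in a little under four months, which is why such rates rarely last long.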
The numbers VOCHI has delivered, however, aren’t as important as what the startup has been through to get there.
The company has been growing its business at a time when a dictatorial regime has been cracking down on opposition, leading to arrests and violence in the country. Last year, employees from U.S.-headquartered enterprise startup PandaDoc were arrested in Minsk by the Belarus police, in an act of state-led retaliation for their protests against President Alexander Lukashenko. In April, Imaguru, the country’s main startup hub, event and co-working space in Minsk — and birthplace of a number of startups, including MSQRD, which was acquired by Facebook — was also shut down by the Lukashenko regime.
Meanwhile, VOCHI was being featured as App of the Day in the App Store across 126 countries worldwide, and growing revenues to around $300,000 per month.
“Personal videos take an increasingly important place in our lives and for many has become a method of self-expression. VOCHI helps to follow the path of inspiration, education and provides tools for creativity through video,” said Andrei Avsievich, General Partner at Bulba Ventures, where VOCHI was incubated. “I am happy that users and investors love VOCHI, which is reflected both in the revenue and the oversubscribed round.”
The additional funds will put VOCHI on the path to a Series A as it continues to work to attract more creators, improve user engagement, and add more tools to the app, says Lesun.
Netradyne, a startup that uses cameras and edge computing to improve commercial driver safety, has scored $150 million in Series C funding. The fresh cash will help the company double down on its current product, Driveri, according to Avneesh Agrawal, CEO and co-founder.
Earlier this year, Netradyne partnered with Amazon to install its hardware and software in its delivery vehicles. The tech giant has faced accusations that it puts speed and efficiency over driver safety, all the while avoiding liability for accidents by employing third-party firms.
Other companies may not have that same morally dubious luxury, which makes Netradyne’s service all the more relevant for fleets. Commercial auto insurance rates are expected to climb 14.2% in 2021, in large part due to distracted drivers using smartphones, which has increased the number of accidents resulting in death, according to a report by insurance company Alera Group. The study also found the cost of repairing modern vehicles and medical costs continue rising at rates higher than inflation. Fleet managers looking to cut costs might be lured by promises of safer driving behavior.
“Nuclear verdicts, in which judgements exceed $10 million, have gone up by almost 500% according to some statistics,” Agrawal told TechCrunch. “It’s becoming the biggest expense for commercial fleets, pretty much after the drivers and fuel. There are a lot of commercial insurance carriers actually going out of business, or they’re passing on the risk to the fleets.”
If Agrawal is to be believed, Netradyne’s service is very much in demand, with subscribers and annual recurring revenue both tripling in 2020. The CEO would not share the base, but he did say Netradyne has over 1,000 customers today.
Netradyne has an agreement with National Interstate Insurance that subsidizes the company’s product, but generally Netradyne sells to a fleet. The pitch is that the fleet should see a reduction in accidents and can then take that data to insurance companies to negotiate better claims.
Netradyne doesn’t provide an average figure for how much safer its cameras and software have made driving, but anecdotally, Agrawal said a couple of the companies that have used the product saw claims decrease by up to 80% in a year.
So how does it work?
Netradyne, whose name combines netra, Sanskrit for “vision,” and dyne, a unit of force whose name comes from the Greek word for power, has built a full stack system that is purely vision-based, according to Agrawal. In simpler terms, that means cameras. The system comes in two form factors. The D-210, built for small-to-medium-sized vehicles, is a dual-facing dashcam featuring both an inward and outward-facing camera, recording both the driver and the road. The D-410 has four HD cameras providing a 360-degree picture, which includes two side window views, and is better suited for heavy duty vehicles.
The cameras pick up anything from a driver being cut off and correctly slowing down to create space between the vehicles to a driver being distracted by texting. A device that connects to the cloud is on board the vehicle, and it’s on the edge of that device that real-time computations are done, which might result in the driver getting feedback and automated suggestions like “please slow down” or “distracted driving.”
“Most importantly, we track the positive driving behavior because we want to change the discussion with the drivers,” said Agrawal. “Drivers are so used to being penalized, and in most cases, it’s actually after the incident has happened or based on a customer’s complaint. This is very proactive and it’s positive.”
In the moment, rewarding behavior can look like a notification to the driver, giving them a little dopamine hit that might encourage continued good driving. Drivers are rewarded with DriverStars, an attempt to gamify commercial operations by encouraging them to rack up points. Those points may be converted to bonuses or other incentives.
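The feedback loop described above can be sketched in a few lines. This is an illustration only, not Netradyne's software: the event labels, the star values and the DriverSession class are hypothetical, and only the two prompt strings come from the article. The sketch assumes an on-device vision model has already turned camera frames into labeled events.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical event labels mapped to the in-cab prompts quoted in the article.
RISKY_PROMPTS: Dict[str, str] = {
    "speeding": "please slow down",
    "phone_use": "distracted driving",
}

# Hypothetical positive events and the stars each one earns.
POSITIVE_EVENTS: Dict[str, int] = {
    "created_following_distance": 1,
}


@dataclass
class DriverSession:
    """Tracks prompts issued and stars earned during one trip."""
    stars: int = 0
    prompts: List[str] = field(default_factory=list)

    def handle(self, event: str) -> Optional[str]:
        """Process one detected event; return a spoken prompt if one is due."""
        if event in RISKY_PROMPTS:
            prompt = RISKY_PROMPTS[event]
            self.prompts.append(prompt)
            return prompt
        if event in POSITIVE_EVENTS:
            self.stars += POSITIVE_EVENTS[event]
        return None  # unknown or positive events produce no spoken warning


session = DriverSession()
for detected in ["created_following_distance", "speeding", "created_following_distance"]:
    session.handle(detected)
```

Because the mapping runs on the device itself, a prompt can be issued without a round trip to the cloud; the accumulated stars could be uploaded later to feed bonuses or other incentives.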
“The drivers are the biggest assets for the fleet, and traditionally, if you ask the fleets, who are your worst drivers, they’ll tell you who because they are the ones who got into accidents, who customers have complained about,” said Agrawal. “If you ask them who are your safer drivers, they can’t really tell, but in our situation because we micro identify not just the drivers who haven’t gotten into accidents, but also drivers are actually being proactive with safe driving behaviour, fleets can focus on those drivers and create retention packages, give them incentives, make them into managers and leadership positions.”
Of course, there’s another upside to all this data collection on driver behavior. Agrawal says his company collects about 700 million miles per month worth of data, analyzing it to identify every potential scenario a driver can get themselves into. And it’s all being done on the edge, which is an experiment in and of itself.
“Investing in autonomous driving is definitely a possibility, but it’s not our focus right now,” said Agrawal.
This Series C round was led by SoftBank Vision Fund 2. Existing investors Point72 Ventures and M12 also participated in the round, bringing Netradyne’s total funding to over $197 million. Agrawal told TechCrunch the company aims to make $100 million in revenue by the end of the year.
A sample video shows how computer vision (running on an external computer) detects the enemy and calculates how far the mouse needs to move to target that enemy.
When it comes to the cat-and-mouse game of stopping cheaters in online games, anti-cheat efforts often rely in part on technology that ensures the wider system running the game itself isn’t compromised. On the PC, that can mean so-called “kernel-level drivers” which monitor system memory for modifications that could affect the game’s intended operation. On consoles, that can mean relying on system-level security that prevents unsigned code from being run at all (until and unless the system is effectively hacked, that is).
But there’s a growing category of cheating methods that can now effectively get around these forms of detection in many first-person shooters. By using external tools like capture cards and “emulated input” devices, along with machine learning-powered computer vision software running on a separate computer, these cheating engines totally circumvent the secure environments set up by PC and console game makers. This is forcing the developers behind these games to look to alternate methods to detect and stop these cheaters in their tracks.
How it works
The basic toolchain used for these external emulated-input cheating methods is relatively simple. The first step is using an external video capture card to record a game’s live output and instantly send it to a separate computer. Those display frames are then run through a computer vision-based object detection algorithm like You Only Look Once (YOLO) that has been trained to find human-shaped enemies in the image (or at least in a small central portion of the image near the targeting reticle).
While Amazon continues to expand its self-service, computer-vision-based grocery checkout technology by bringing it to bigger stores, an AI startup out of Israel that’s built something to rival it has picked up funding and a new strategic investor as a customer.
Trigo, which has produced a computer vision system that includes both camera hardware and encrypted, privacy-compliant software to enable “grab and go” shopping — where customers can pick up items that get automatically detected and billed before they leave the store — has bagged $10 million in funding from German supermarket chain REWE Group and Viola Growth.
The exact amount of the investment was not initially disclosed (perhaps because $10 million, in these crazy times, suddenly sounds like a modest amount?), but Pitchbook notes that Trigo had up to now raised $87 million, and Trigo said that it has now raised “over $100 million,” including a Series A in 2019, and a Series B of $60 million that it raised in December of last year. The company has since confirmed that the amount raised today is $10 million, and $104 million in total.
The company is not disclosing its valuation. We have asked and will update as we learn more.
“Trigo is immensely proud and honored to be deepening its strategic partnership with REWE Group, one of Europe’s biggest and most innovative grocery retailers,” said Michael Gabay, Trigo co-founder and CEO, in a statement. “REWE have placed their trust in Trigo’s privacy-by-design architecture, and we look forward to bringing this exciting technology to German grocery shoppers. We are also looking forward to working with Viola Growth, an iconic investment firm backing some of Israel’s top startups.”
The REWE investment is part of a bigger partnership between the two companies, which will begin with a new “grab and go” REWE store in Cologne. REWE has 3,700 stores across Germany, so there is a lot of scope there for expansion. REWE is Trigo’s second strategic investor: Tesco has also backed the startup and has been trialling its technology in the U.K. Trigo is also being used by Shufersal, a grocery chain in Israel.
REWE’s investment comes amid a spate of tech engagements by the grocery giant, which recently also announced a partnership with Flink, a new grocery delivery startup out of Germany that recently raised a big round of funding to expand. It’s also working with Yamo, a healthy eating startup; and Whisk, an AI powered buy-to-cook startup.
“With today’s rapid technological developments, it is crucial to find the right partners,” said Christoph Eltze, Executive Board Member Digital, Customer & Analytics REWE Group. “REWE Group is investing in its strategic partnership with Trigo, who we believe is one of the leading companies in computer vision technologies for smart stores.”
More generally, consumer habits are changing, fast. Whether we are talking about the average family, or the average individual, people are simply not shopping, cooking and eating in the same way that they were even 10 years ago, let alone 20 or 30 years ago.
And so like many others in the very established brick-and-mortar grocery business, REWE — founded in 1927 — is hoping to tie up with some of the more interesting innovators to better keep ahead in the game.
“I don’t actually think people really want grocery e-commerce,” Ran Peled, Trigo’s VP of marketing, told me back in 2019. “They do that because the supermarket experience has become worse with the years. We are very much committed to helping brick and mortar stores return to the time of a few decades ago, when it was fun to go to the supermarket. What would happen if a store could have an entirely new OS that is based on computer vision?”
It will be interesting to see how widely used and “fun” smart checkout services will become in that context, and whether it will be a winner-takes-all market, or whether we’ll see a proliferation of others emerge to provide similar tools.
In addition to Amazon and Trigo, there is also Standard Cognition, which earlier this year raised money at a $1 billion valuation, along with other companies and other approaches. One thing that more competition could mean is more competitive pricing for systems that otherwise could prove costly to implement and run except in the busiest locations.
There is also a bigger question over what the optimal size will be for cashierless, grab-and-go technology. Trigo cites data from Juniper Research that forecasts $400 billion in smart checkout transactions annually by 2025, but it seems that the focus in that market will likely be, in Juniper’s view, on smaller grocery and convenience stores rather than the cavernous cathedrals to consumerism that many of these chains operate. In that category, the market size is 500,000 stores globally, 120,000 of them in Europe.
Start-ups are using technology to take a robotic approach to manicures, offering a simple way to provide foolproof nail polish.
eYs3D Microelectronics, a fabless design house that focuses on end-to-end software and hardware systems for computer vision technology, has raised a $7 million Series A. Participants included ARM IoT Capital, WI Harper and Marubun Corporation, who will each serve as strategic investors.
Based in Taipei, Taiwan, eYs3D was spun out of Etron, a fabless IC and system-in-package (SiP) design firm, in 2016. It will use its new funding to build its embedded chip business in new markets. The company’s technology, including integrated circuits, 3D sensors, camera modules and AI-based software, has a wide range of applications, such as robotics, touchless controls, autonomous vehicles and smart retail. eYs3D’s products have been used in the Facebook Oculus Rift S and Valve Index virtual reality headsets, and Techman Robots.
ARM, the microprocessor company, will integrate eYs3D’s chips into its CPUs and NPUs. WI Harper, a cross-border investment firm with offices in Taipei, Beijing and San Francisco, will give eYs3D access to its international network of industrial partners. Marubun Corporation, a Japan-based company that distributes semiconductors and other electronic components, will open new distribution channels for eYs3D.
In a press statement, ARM IoT Capital chairman Peter Hsieh said, “As we look to the future, enhanced computer vision support plays a key role in ARM’s AI architecture and deployment. eYs3D’s innovative 3D computer vision capability can offer the market major benefits, and we are pleased to partner with the company and invest in the creation of more AI-capable vision processors.”
The new funding will also be used to expand eYs3D’s product development and launch a series of 3D computer vision modules. It will also work with new business partners to expand its platform and hire more talent.
eYs3D’s chief strategy officer James Wang told TechCrunch that the global chip shortage and Taiwan’s drought haven’t significantly impacted the company’s business or production plans, because it works with Etron as its integrated circuits manufacturing service.
“Etron Technology is one of the major accounts for the Taiwanese foundry sector and has strong relationships with the foundries, so eYs3D can receive products for its customers as required,” he said. “Meanwhile, eYs3D works closely with its major customers to schedule a just-in-time supply chain for their production pipelines.”
The company’s systems combine silicon design and algorithms to manage information collected from different sensor sources, including thermal, 3D and neural network perception. Its technology is capable of supporting visual simultaneous localization and mapping (vSLAM), object feature depth recognition, and gesture-based commands.
Wang said eYs3D can provide end-to-end services, from integrated circuit design to ready-to-use products, and works closely with clients to determine what they need. For example, it offered its chip solution to an autonomous robot company for obstacle avoidance and people-tracing features.
“Since their expertise is in robotic motor controls and mechanicals, they needed a more complete solution for a design module for 3D sensing, as well as object and people recognition. We provided them with one of our 3D depth camera solutions and SDK along with middleware algorithm samples for their validation,” said Wang. “The customer took our design package and seamlessly integrated our 3D depth camera solution for proof-of-concept within a short period of time. Next, we helped them to retrofit the camera design to fit in their robot body prior to commercialization of the robot.”
Snap yesterday announced the latest iteration of its Spectacles augmented reality glasses, and today the company revealed a bit more news: it is also acquiring the startup that supplied the technology that helps power them. The Snapchat parent is snapping up WaveOptics, an AR startup that makes the waveguides and projectors used in AR glasses. These overlay virtual images on top of the views of the real world someone wearing the glasses can see, and Snap worked with WaveOptics to build its latest version of Spectacles.
The deal was first reported by The Verge, and a spokesperson for Snap directly confirmed the details to TechCrunch. Snap is paying over $500 million for the startup, in a cash-and-stock deal. The first half of that will be coming in the form of stock when the deal officially closes, and the remainder will be payable in cash or stock in two years.
This is a big leap for WaveOptics, which had raised around $65 million in funding from investors that included Bosch, Octopus Ventures and a host of individuals, from Stan Boland (veteran entrepreneur in the UK, most recently at FiveAI) to Ambarish Mitra (the co-founder of early AR startup Blippar). PitchBook estimates that its most recent valuation was only around $105 million.
WaveOptics was founded in Oxford, and it’s not clear where the team will be based after the deal is closed — we have asked.
We have been covering the company since its earliest days, when it displayed some very interesting, early, and ahead-of-its-time technology: waveguides based on hologram physics and photonic crystals. The key thing is that its tech drastically compresses the size and load of the hardware needed to process and display images, meaning a much wider and more flexible range of form factors for AR hardware based on WaveOptics tech.
It’s not clear whether WaveOptics will continue to work with other parties post-deal, but it seems that one obvious advantage for Snap would be making the startup’s technology exclusive to itself.
Snap has been on something of an acquisition march in recent times — it’s made at least three other purchases of startups since January, including Fit Analytics for an AR-fuelled move into e-commerce, as well as Pixel8Earth and StreetCred for its mapping tools.
This deal, however, marks Snap’s biggest acquisition to date in terms of valuation. That is not only a mark of the premium price that foundational artificial intelligence tech continues to command — in addition to the team of scientists that built WaveOptics, it also has 12 filed and in-progress patents — but also Snap’s financial and, frankly, existential commitment to having a seat at the table when it comes not just to social apps that use AR, but hardware, and being at the centre of not just using the tech, but setting the pace and agenda for how and where that will play out.
That’s been a tenacious and not always rewarding place for it to be, but the company — which has long described itself as a “camera company” — has kept hardware in the mix as an essential component for its future strategy.
Machine learning is capable of doing all sorts of things as long as you have the data to teach it how. That’s not always easy, and researchers are always looking for a way to add a bit of “common sense” to AI so you don’t have to show it 500 pictures of a cat before it gets it. Facebook’s newest research takes a big step towards reducing the data bottleneck.
The company’s formidable AI research division has been working on how to advance and scale things like computer vision algorithms for years now, and has made steady progress, generally shared with the rest of the research community. One interesting development Facebook has pursued in particular is what’s called “semi-supervised learning.”
Generally when you think of training an AI, you think of something like the aforementioned 500 pictures of cats — images that have been selected and labeled (which can mean outlining the cat, putting a box around the cat, or just saying there’s a cat in there somewhere) so that the machine learning system can put together an algorithm to automate the process of cat recognition. Naturally if you want to do dogs or horses, you need 500 dog pictures, 500 horse pictures, etc — it scales linearly, which is a word you never want to see in tech.
Semi-supervised learning, related to “unsupervised” learning, involves figuring out important parts of a dataset without any labeled data at all. It doesn’t just go wild; there’s still structure. For instance, imagine you give the system a thousand sentences to study, then show it ten more that have several of the words missing. The system could probably do a decent job filling in the blanks just based on what it’s seen in the previous thousand. But that’s not so easy to do with images and video, which aren’t as straightforward or predictable.
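The fill-in-the-blank idea can be shown with a toy example. What follows is a minimal sketch, not anything Facebook describes: it simply counts which word appeared between each pair of neighbouring words in unlabeled sentences, then uses those counts to guess a missing word. The function names, the `___` blank marker and the sample sentences are all made up for illustration.

```python
from collections import Counter, defaultdict

def learn_contexts(sentences):
    """Count which word appears between each (left, right) pair of neighbours."""
    contexts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for left, middle, right in zip(words, words[1:], words[2:]):
            contexts[(left, right)][middle] += 1
    return contexts

def fill_blank(contexts, sentence, blank="___"):
    """Guess the blanked-out word from its two neighbours, or None if unseen."""
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    i = words.index(blank)
    candidates = contexts.get((words[i - 1], words[i + 1]))
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Hypothetical unlabeled "training" text: nobody told the system what any word means.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat sat on the rug",
]
contexts = learn_contexts(corpus)
print(fill_blank(contexts, "the dog ___ on the mat"))
```

No labels are involved at any point; the structure of the text itself supplies the training signal, which is the part that is harder to reproduce with pixels.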
But Facebook researchers have shown that while it may not be easy, it’s possible and in fact very effective. The DINO system (which stands rather unconvincingly for “DIstillation of knowledge with NO labels”) is capable of learning to find objects of interest in videos of people, animals, and objects quite well without any labeled data whatsoever.
It does this by considering the video not as a sequence of images to be analyzed one by one in order, but as a complex, interrelated set, like the difference between “a series of words” and “a sentence.” By attending to the middle and the end of the video as well as the beginning, the agent can get a sense of things like “an object with this general shape goes from left to right.” That information feeds into other knowledge, like when an object on the right overlaps with the first one, the system knows they’re not the same thing, just touching in those frames. And that knowledge in turn can be applied to other situations. In other words, it develops a basic sense of visual meaning, and does so with remarkably little training on new objects.
This results in a computer vision system that’s not only effective (it performs well compared with traditionally trained systems) but also more relatable and explainable. For instance, while an AI that has been trained with 500 dog pictures and 500 cat pictures will recognize both, it won’t really have any idea that they’re similar in any way. But DINO, although it couldn’t be specific, gets that they’re similar visually to one another, more so anyway than they are to cars, and that metadata and context is visible in its memory. Dogs and cats are “closer” in its sort of digital cognitive space than dogs and mountains. Plotted visually, those concepts show up as little blobs, with those of a type sticking together.
This has its own benefits, of a technical sort we won’t get into here. If you’re curious, there’s more detail in the papers linked in Facebook’s blog post.
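To make the notion of dogs and cats being “closer” than dogs and mountains a little more concrete, here is a minimal sketch using cosine similarity, a common way to compare feature vectors. The three-number vectors are invented for illustration and are not DINO outputs; real embeddings have hundreds of dimensions, and the article does not say which distance measure Facebook uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, made up for this example.
embeddings = {
    "dog": [0.90, 0.80, 0.10],
    "cat": [0.85, 0.75, 0.20],
    "mountain": [0.10, 0.20, 0.95],
}

dog_cat = cosine_similarity(embeddings["dog"], embeddings["cat"])
dog_mountain = cosine_similarity(embeddings["dog"], embeddings["mountain"])
print(f"dog vs cat:      {dog_cat:.2f}")
print(f"dog vs mountain: {dog_mountain:.2f}")
```

With these made-up numbers the dog and cat vectors point in nearly the same direction while the mountain vector does not, which is the kind of relationship the blobs are meant to convey.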
There’s also an adjacent research project, a training method called PAWS, which further reduces the need for labeled data. PAWS combines some of the ideas of semi-supervised learning with the more traditional supervised method, essentially giving the training a boost by letting it learn from both the labeled and unlabeled data.
Facebook of course needs good and fast image analysis for its many user-facing (and secret) image-related products, but these general advances to the computer vision world will no doubt be welcomed by the developer community for other purposes.
Sign language is used by millions of people around the world, but unlike Spanish, Mandarin or even Latin, there’s no automatic translation available for those who can’t use it. SLAIT claims the first such tool available for general use, which can translate around 200 words and simple sentences to start — using nothing but an ordinary computer and webcam.
People with hearing impairments, or other conditions that make vocal speech difficult, number in the hundreds of millions, and rely on the same common tech tools as the hearing population. But while emails and text chat are useful and of course very common now, they aren’t a replacement for face-to-face communication, and unfortunately there’s no easy way for signing to be turned into written or spoken words, so this remains a significant barrier.
We’ve seen attempts at automatic sign language (usually American/ASL) translation for years and years: in 2012 Microsoft awarded its Imagine Cup to a student team that tracked hand movements with gloves; in 2018 I wrote about SignAll, which has been working on a sign language translation booth using multiple cameras to give 3D positioning; and in 2019 I noted that a new hand-tracking algorithm called MediaPipe, from Google’s AI labs, could lead to advances in sign detection. Turns out that’s more or less exactly what happened.
SLAIT is a startup built out of research done at the Aachen University of Applied Sciences in Germany, where co-founder Antonio Domènech built a small ASL recognition engine using MediaPipe and custom neural networks. Having proved the basic notion, Domènech was joined by co-founders Evgeny Fomin and William Vicars to start the company; they then moved on to building a system that could recognize first 100, and now 200 individual ASL gestures and some simple sentences. The translation occurs offline, and in near real time on any relatively recent phone or computer.
They plan to make it available for educational and development work, expanding their dataset so they can improve the model before attempting any more significant consumer applications.
Of course, the development of the current model was not at all simple, though it was achieved in remarkably little time by a small team. MediaPipe offered an effective, open-source method for tracking hand and finger positions, sure, but the crucial component for any strong machine learning model is data, in this case video data (since it would be interpreting video) of ASL in use — and there simply isn’t a lot of that available.
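As a rough illustration of how tracked hand positions can be turned into a recognized sign, here is a minimal sketch. It is not SLAIT’s method: the company uses custom neural networks, whereas this swaps in a much simpler nearest-neighbour lookup. It assumes a tracker has already produced a list of (x, y, z) points per hand with the wrist at index 0 (MediaPipe’s hand model reports 21 such landmarks); the function names and the tiny three-point “hands” are hypothetical.

```python
import math

def normalize(landmarks):
    """Flatten landmarks after removing position and size differences.

    landmarks: list of (x, y, z) tuples; index 0 is treated as the wrist.
    """
    wx, wy, wz = landmarks[0]
    centered = [(x - wx, y - wy, z - wz) for x, y, z in landmarks]
    # Scale so the farthest point from the wrist sits at distance 1.
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered) or 1.0
    return [value / scale for point in centered for value in point]

def classify(landmarks, examples):
    """Return the label of the stored example whose hand shape is nearest.

    examples: list of (label, landmarks) pairs recorded in advance.
    """
    query = normalize(landmarks)
    return min(examples, key=lambda ex: math.dist(query, normalize(ex[1])))[0]

# Hypothetical reference shapes, shortened to three points for readability.
examples = [
    ("open", [(0, 0, 0), (0, 1, 0), (1, 1, 0)]),
    ("fist", [(0, 0, 0), (0, 0.2, 0), (0.1, 0.1, 0)]),
]

# The same "open" shape, but farther from the origin and twice as large.
print(classify([(5, 5, 0), (5, 7, 0), (7, 7, 0)], examples))
```

Even in this toy form the dependence on data is obvious: the classifier can only ever return labels it has stored examples for, so every new sign needs new recordings.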
As they recently explained in a presentation for the DeafIT conference, the team first evaluated using an older Microsoft database, but found that a newer Australian academic database had more and better quality data, allowing for the creation of a model that is 92 percent accurate at identifying any of 200 signs in real time. They have augmented this with sign language videos from social media (with permission, of course) and government speeches that have sign language interpreters, but they still need more.
Their intention is to make the platform available to the deaf and ASL learner communities, who hopefully won’t mind their use of the system being turned to its improvement.
And naturally it could prove an invaluable tool in its present state, since the company’s translation model, even as a work in progress, is still potentially transformative for many people. With the number of video calls going on these days and likely for the rest of eternity, accessibility is being left behind: only some platforms offer automatic captioning, transcription or summaries, and certainly none recognize sign language. But with SLAIT’s tool people could sign normally and participate in a video call naturally rather than using the neglected chat function.
“In the short term, we’ve proven that 200 word models are accessible and our results are getting better every day,” said SLAIT’s Evgeny Fomin. “In the medium term, we plan to release a consumer facing app to track sign language. However, there is a lot of work to do to reach a comprehensive library of all sign language gestures. We are committed to making this future state a reality. Our mission is to radically improve accessibility for the Deaf and hard of hearing communities.”
He cautioned that it will not be totally complete — just as translation and transcription in or to any language is only an approximation, the point is to provide practical results for millions of people, and a few hundred words goes a long way toward doing so. As data pours in, new words can be added to the vocabulary, and new multi-gesture phrases as well, and performance for the core set will improve.
Right now the company is seeking initial funding to get its prototype out and grow the team beyond the founding crew. Fomin said they have received some interest but want to make sure they connect with an investor who really understands the plan and vision.
When the engine itself has been built up to be more reliable by the addition of more data and the refining of the machine learning models, the team will look into further development and integration of the app with other products and services. For now the product is more of a proof of concept, but what a proof it is — with a bit more work SLAIT will have leapfrogged the industry and provided something that deaf and hearing people both have been wanting for decades.
Tel Aviv’s Orca AI, a computer vision startup whose technology can be retrofitted to cargo ships to improve navigation and collision avoidance, has raised $13 million in Series A funding, taking its total raised to over $15.5 million. While most cargo ships carry security cameras, computer vision cameras are rare. Orca AI hopes its solution could introduce autonomous guidance to vessels already at sea.
There are over 4,000 annual marine incidents, largely due to human error. The company says this is getting worse as the Coronavirus pandemic makes it harder for regular crew changes. The recent events in the Suez Canal have highlighted how crucial this industry is.
The funding round was led by OCV Partners, with Principal Zohar Loshitzer joining Orca AI’s board. Mizmaa Ventures and Playfair Capital also featured.
The company was founded by naval technology experts Yarden Gross and Dor Raviv. The latter is a former Israeli navy computer vision expert. Customers include Kirby, Ray Car Carriers and NYK.
Orca AI’s AI-based navigation and vessel tracking system supports ships in difficult-to-navigate situations and congested waterways, using vision sensors, thermal and low-light cameras, plus algorithms that look at the environment and alert crews to dangerous situations.
On the raise, CEO and co-founder Yarden Gross said: “The maritime industry… is still far behind aviation with technological innovations. Ships deal with increasingly congested waterways, severe weather and low-visibility conditions creating difficult navigation experiences with often expensive cargo… Our solution provides unique insight and data to any ship in the world, helping to reduce these challenging situations and collisions in the future.”
Zohar Loshitzer, Principal from OCV, added: “Commercial shipping has historically been a highly regulated and traditional industry. However, we are now witnessing a positive change in the adoption of tech solutions to increase safety and efficiency.”
Joy Buolamwini is on a crusade against bias in facial recognition technology, and the powerful companies that profit from it.
COVID-19 forced many retailers and brands to adopt new technologies. Retail analytics unicorn Trax expects that this openness to tech innovation will continue even after the pandemic. The Singapore-based company announced today that it has raised $640 million in Series E funding to expand its products, which combine computer vision and cloud-based software to help brick-and-mortar stores manage their inventory, merchandising and operations. The round included primary and secondary capital, and was led by SoftBank Vision Fund 2 and returning investor BlackRock. Other participants included new investors OMERS and Sony Innovation Fund by IGV.
Before this round, Trax had raised $360 million in primary funds. J.P. Morgan acted as a placement agent to Trax on its Series E, which brings its total funding so far to $1.02 billion. Trax did not disclose a new valuation, but reportedly hit unicorn status in 2019. Reports emerged last year that it is considering a public offering, but chief executive officer Justin Behar had no comment when asked by TechCrunch if Trax is planning for an IPO.
Founded in 2010 and headquartered in Singapore, Trax also has offices in Brazil, the United States, China, the United Kingdom, Israel, Mexico, Japan, Hungary, France, Russia and Australia. The company says it serves customers in more than 90 countries.
Behar told TechCrunch that the new funding will be used to “invest heavily in global [go-to-market] strategies and technology for our flagship Retail Watch solution, as we look for ways to make it easier for retailers and brands to continue their digitization journey. More specifically, we will use the capital to accelerate growth and triple-down on continued innovation across our core vision, machine learning, IoT and marketplace technologies.”
Launched last year, Retail Watch uses a combination of computer vision, machine learning and hardware like cameras and autonomous robots, to gather real-time data about the shelf availability of products. It sends alerts if stock is running low, corrects pricing errors and checks if planograms, or product display plans for visual merchandising, are being followed. Retail Watch currently focuses on center shelves, where packaged goods are usually stocked, but will expand into categories like fresh food and produce.
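The checks described above (low stock, empty shelves, planogram compliance) come down to comparing what the cameras see with what the plan says should be there. Here is a minimal sketch of that comparison; it is not Trax’s implementation, and the data layout, the `min_facings` threshold and the alert names are all assumptions made for illustration.

```python
def audit_shelf(planogram, detected, min_facings=2):
    """Compare detected shelf contents against the planogram.

    planogram: dict mapping slot id -> expected SKU.
    detected:  dict mapping slot id -> (SKU seen, number of visible facings).
    Returns a list of (slot, alert) pairs in planogram order.
    """
    alerts = []
    for slot, expected_sku in planogram.items():
        seen_sku, facings = detected.get(slot, (None, 0))
        if facings == 0:
            alerts.append((slot, "out_of_stock"))
        elif seen_sku != expected_sku:
            alerts.append((slot, "planogram_mismatch"))
        elif facings < min_facings:
            alerts.append((slot, "low_stock"))
    return alerts

# Hypothetical shelf: three slots, one correct, one wrong product, one running low.
planogram = {"A1": "cola", "A2": "chips", "A3": "soap"}
detected = {"A1": ("cola", 5), "A2": ("soap", 3), "A3": ("soap", 1)}

for slot, alert in audit_shelf(planogram, detected):
    print(slot, alert)
```

The hard part in practice is producing the `detected` dictionary reliably from images, which is where the computer vision comes in; the comparison itself is simple bookkeeping.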
The funding will also be used to expand Trax’s Dynamic Merchandising, a partnership with on-demand work platform Flexforce, and Shopkick, the shopping rewards app Trax acquired in 2019, into new markets over the next one to two years.
“Finally, we see many opportunities to help retailers along their digitization journey and will be expanding into new use cases with products we develop internally and via potential acquisitions,” Behar said.
Early in the pandemic, retailers had to cope with surge buying, as customers emptied shelves of stock while preparing to stay at home. As the pandemic continued, buying patterns shifted dramatically. In April 2020, Forrester forecast that COVID-19 would cause global retail sales to decline by an average of 9.6%, resulting in a loss of $2.1 trillion, and that it would take about four years for retailers to overtake pre-pandemic levels.
In a more recent report, Forrester found that, despite spending cuts, nearly 40% of retailers and wholesalers immediately increased their tech investment, in some cases implementing projects in weeks that would have otherwise taken years.
Behar said “the pandemic made it clear the retail industry was not prepared for a sudden change in demand, as consumers faced empty shelves and out-of-stocks for extended periods in key categories. These extreme shifts in consumer behavior, coupled with global supply chain disruptions, labor shortages, changing channel dynamics (such as e-commerce) and decrease in brand loyalty forced brands and retailers to develop new strategies to meet the evolving needs of their customers.”
He expects that willingness to adopt new technologies will continue after the pandemic. For example, to get shoppers back into brick-and-mortar stores, retailers might try things like in-store navigation, improved browsing, loyalty programs and new checkout and payment systems.
Trax’s Retail Watch, Dynamic Merchandising and Dynamic Workforce Management solutions were in development before the pandemic, though “it has certainly expedited the need for innovative digital solutions to longstanding retail pain points,” Behar added.
For example, Retail Watch supports online ordering features, like showing what products are available to online shoppers and helping store associates fulfill orders, while Dynamic Merchandising lets brands find on-demand workers for in-store execution issues—for example, if new stock needs to be delivered to a location immediately.
Other tech companies focused on retail analytics include Quant Retail, Pensa Systems and Bossa Nova Robotics. Behar said Trax differentiates with a cloud-based platform that is “extensible, flexible and scalable and combines multiple integrated technologies and data-collection methods, optimized to fit each store, such as IoT-enabled shelf-edge cameras, dome cameras, autonomous robots and images taken from smartphones, to enable complete and accurate store coverage.”
Its proprietary computer vision technology was also designed specifically for use in retail stores, and identifies individual SKUs on shelves, regardless of category. For example, Behar said it can distinguish between near identical or multiple products, deal with visual obstructions like odd angles or products that are obscured by another item and recognize issues with price tags.
“Like many innovative solutions, our most meaningful competition comes from the legacy systems deeply entrenched in the world of retail and the fear of change,” he added. “While we do see an acceleration of interest and adoption of digital innovation as a result of the ‘COVID effect,’ this is by far our biggest challenge.”
In a press statement, SoftBank Investment Advisers director Chris Lee said, “Through its innovative AI platform and image recognition technologies, we believe Trax is optimizing retail stores by enabling [consumer packaged goods] brands and retailers to execute better inventory strategies using data and analytics. We are excited to partner with the Trax team to help expand their product offerings and enter new markets.”
Hong Kong-based viAct helps construction sites perform around-the-clock monitoring with an AI-based cloud platform that combines computer vision, edge devices and a mobile app. The startup announced today it has raised a $2 million seed round, co-led by SOSV and Vectr Ventures. The funding included participation from Alibaba Hong Kong Entrepreneurs Fund, Artesian Ventures and ParticleX.
Founded in 2016, viAct currently serves more than 30 construction industry clients in Asia and Europe. Its new funding will be used on research and development, product development and expanding into Southeast Asian countries.
The platform uses computer vision to detect potential safety hazards, construction progress and the location of machinery and materials. Real-time alerts are sent to a mobile app with a simple interface, designed for engineers who are often “working in a noisy and dynamic environment that makes it hard to look at detailed dashboards,” co-founder and chief operating officer Hugo Cheuk told TechCrunch.
As companies signed up for viAct to monitor sites while complying with COVID-19 social distancing measures, the company provided training over Zoom to help teams onboard more quickly.
Cheuk said the company’s initial markets in Southeast Asia will include Indonesia and Vietnam because government planning for smart cities and new infrastructure means new construction projects there will increase over the next five to 10 years. It will also enter Singapore because developers are willing to adopt AI-based technology.
In a press statement, SOSV partner and Chinaccelerator managing director Oscar Ramos said, “COVID has accelerated digital transformation and traditional industries like construction are going through an even faster process of transformation that is critical for survival. The viAct team has not only created a product that drives value for the industry but has also been able to earn the trust of their customers and accelerate adoption.”
An artificial retina would be an enormous boon to the many people with visual impairments, and the possibility is creeping closer to reality year by year. One of the latest advancements takes a different and very promising approach, using tiny dots that convert light to electricity, and virtual reality has helped show that it could be a viable path forward.
These photovoltaic retinal prostheses come from the École polytechnique fédérale de Lausanne, where Diego Ghezzi has been working on the idea for several years now.
Early retinal prosthetics were created decades ago, and the basic idea is as follows. A camera outside the body (on a pair of glasses, for instance) sends a signal over a wire to a tiny microelectrode array, which consists of many tiny electrodes that pierce the non-functioning retinal surface and stimulate the working cells directly.
The problems with this are mainly that powering and sending data to the array requires a wire running from outside the eye in — generally speaking a “don’t” when it comes to prosthetics, and the body in general. The array itself is also limited in the number of electrodes it can have by the size of each, meaning for many years the effective resolution in the best case scenario was on the order of a few dozen or hundred “pixels.” (The concept doesn’t translate directly because of the way the visual system works.)
Ghezzi’s approach obviates both these problems with the use of photovoltaic materials, which turn light into an electric current. It’s not so different from what happens in a digital camera, except instead of recording the charge as an image, it sends the current into the retina like the powered electrodes did. There’s no need for a wire to relay power or data to the implant, because both are provided by the light shining on it.
In the case of the EPFL prosthesis, there are thousands of tiny photovoltaic dots, which would in theory be illuminated by a device outside the eye sending light in according to what it detects from a camera. Of course, it’s still an incredibly difficult thing to engineer. The other part of the setup would be a pair of glasses or goggles that both capture an image and project it through the eye onto the implant.
We first heard of this approach back in 2018, and things have changed somewhat since then, as a new paper documents.
“We increased the number of pixels from about 2,300 to 10,500,” explained Ghezzi in an email to TechCrunch. “So now it is difficult to see them individually and they look like a continuous film.”
Of course when those dots are pressed right up against the retina it’s a different story. After all, that’s only 100×100 pixels or so if it were a square — not exactly high definition. But the idea isn’t to replicate human vision, which may be an impossible task to begin with, let alone realistic for anyone’s first shot.
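As a quick check on the "100×100 or so" figure, the short Python sketch below computes the side of the largest square grid that fits a given pixel count. It only assumes the two pixel counts Ghezzi quotes; the function name `square_side` is made up for illustration, and the real implant is not laid out as a square.

```python
import math


def square_side(pixel_count: int) -> int:
    """Side length, in whole pixels, of the largest square grid that fits pixel_count."""
    return math.isqrt(pixel_count)


# Pixel counts quoted by Ghezzi for the earlier and the newer implant.
for count in (2300, 10500):
    side = square_side(count)
    print(f"{count} pixels is roughly a {side} x {side} square")
```

For 10,500 pixels this gives a grid of about 102 on a side, in line with the rough figure above.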
“Technically it is possible to make pixel smaller and denser,” Ghezzi explained. “The problem is that the current generated decreases with the pixel area.”
So the more you add, the tougher it is to make it work, and there’s also the risk (which they tested) that two adjacent dots will stimulate the same network in the retina. But too few and the image created may not be intelligible to the user. 10,500 sounds like a lot, and it may be enough — but the simple fact is that there’s no data to support that. To start on that the team turned to what may seem like an unlikely medium: VR.
Because the team can’t exactly do a “test” installation of an experimental retinal implant on people to see if it works, they needed another way to tell whether the dimensions and resolution of the device would be sufficient for certain everyday tasks like recognizing objects and letters.
To do this, they put people in VR environments that were dark except for little simulated “phosphors,” the pinpricks of light they expect to create by stimulating the retina via the implant; Ghezzi likened what people would see to a constellation of bright, shifting stars. They varied the number of phosphors, the area they appear over, and the length of their illumination or “tail” when the image shifted, asking participants how well they could perceive things like a word or scene.
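To give a feel for what a sparse dot rendering of a scene involves, here is a minimal Python sketch that reduces a grayscale image to a coarse grid of on/off dots by averaging blocks and thresholding. This is an illustrative assumption, not the team's simulation: the function name `to_phosphenes`, the square grid, and the fixed threshold are all hypothetical, and it ignores visual angle and the illumination "tail."

```python
def to_phosphenes(image, grid, threshold=0.5):
    """Reduce a grayscale image (rows of floats in [0, 1]) to a grid x grid
    pattern of on/off dots by averaging each block and thresholding it."""
    height, width = len(image), len(image[0])
    if grid > min(height, width):
        raise ValueError("grid must not exceed the image dimensions")
    dots = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        row = []
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block) >= threshold)
        dots.append(row)
    return dots


# Toy 8 x 8 scene: a bright vertical bar in columns 2 and 3 on a dark background.
bar_image = [[1.0 if x in (2, 3) else 0.0 for x in range(8)] for _ in range(8)]

for row in to_phosphenes(bar_image, grid=4):
    print("".join("#" if lit else "." for lit in row))
```

Changing `grid` mimics varying the number of dots: at a low count the bar collapses into a single column of lit points, which is the kind of trade-off the participants were asked to judge.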
Their primary finding was that the most important factor was visual angle — the overall size of the area where the image appears. Even a clear image is difficult to understand if it only takes up the very center of your vision, so even if overall clarity suffers it’s better to have a wide field of vision. The robust analysis of the visual system in the brain intuits things like edges and motion even from sparse inputs.
This demonstration showed that the implant’s parameters are theoretically sound and the team can start working towards human trials. That’s not something that can happen in a hurry, and while this approach is very promising compared with earlier, wired ones, it will still be several years even in the best case scenario before it’s possible it could be made widely available. Still, the very prospect of a working retinal implant of this type is an exciting one and we’ll be following it closely.
When Google forced out two well-known artificial intelligence experts, a long-simmering research controversy burst into the open.
SuperAnnotate, a NoCode computer vision platform, is partnering with OpenCV, a non-profit organization that has built a large collection of open-source computer vision algorithms. The move means startups and entrepreneurs will be able to build their own AI models and allow cameras to detect objects using machine learning. SuperAnnotate has raised $3M to date from investors including Point Nine Capital, Fathom Capital and Berkeley SkyDeck Fund.
The AI-powered computer vision platform for data scientists and annotation teams will provide OpenCV AI Kit (OAK) users with access to its platform, as well as launching a computer vision course on building AI models. SuperAnnotate will also set up the AI Kit’s camera to detect objects using machine learning and OAK users will get $200 of credit to set up their systems on its platform.
The OAK is a multi-camera device that can run computer vision and 3D perception tasks such as identifying objects, counting people and measuring distances. Since launching, around 11,000 of these cameras have been distributed.
The AI Kit has so far been used to build drone and security applications, agricultural vision sensors or even COVID-related detection devices (for example, to identify whether someone is wearing a mask or not).
Tigran Petrosyan, co-founder and CEO at SuperAnnotate said in a statement that: “Computer vision and smart camera applications are gaining momentum, yet not many have the relevant AI expertise to implement those. With OAK Kit and SuperAnnotate, one can finally build their smart camera system, even without coding experience.”
Competitors to SuperAnnotate include Dataloop, Labelbox, Appen and Hive.
Last fall, Amazon introduced a new biometric device, Amazon One, that allowed customers to pay at Amazon Go stores using their palm. Today, the company says the device is being rolled out to additional Amazon stores in Seattle — an expansion that will make the system available across eight total Amazon physical retail stores, including Amazon Go convenience stores, Amazon Go Grocery, Amazon Books, and Amazon 4-star stores.
Starting today, the Amazon One system is being added as an entry option at the Amazon Go location at Madison & Minor in Seattle. In the next few weeks, it will also roll out to two more Amazon Go stores, at 5th & Marion and Terry & Stewart, the company says. That brings the system to eight Seattle locations, and sets the stage for a broader U.S. expansion in the months ahead.
As described, the Amazon One system uses computer vision technology to create a unique palm print for each customer, which Amazon then associates with the credit card the customer inserts upon initial setup. While the customer doesn’t have to have an Amazon account to use the service, if they do associate their account information, they’ll be able to see their shopping history on the Amazon website.
Amazon says images of the palm print are encrypted and secured in the cloud, where customers’ palm signatures are created. At the time of its initial launch, Amazon argued that palm prints were a more private form of biometric authentication than some other methods, because you can’t determine a customer’s identity based only on the image of their palm.
But Amazon isn’t just storing palm images, of course. It’s matching them to customer accounts and credit cards, effectively building a database of customer biometrics. It can also then use the data collected, like shopping history, to introduce personalized offers and recommendations over time.
What’s more is the company doesn’t just envision Amazon One as a means of entry into its own stores — they’re just a test market. In time, Amazon wants to make the technology available to third-parties, as well, including stadiums, office buildings and other non-Amazon retailers.
The timing of the Amazon One launch in the middle of a pandemic has helped spur customer adoption, as it allows for a contactless way to associate your credit card with your future purchases. Upon subsequent re-entry, you just hold your hand above the reader to be scanned again and let into the store.
These systems, however, can disadvantage a lower-socioeconomic group of customers, who prefer to pay using cash. They have to wait for special assistance in these otherwise cashless, checkout-free stores.
Amazon says the system will continue to roll out to more locations in the future.
An online tool targets only a small slice of what’s out there, but may open some eyes to how widely artificial intelligence research fed on personal images.
In recent years we’ve seen a whole bunch of visual/style fashion-focused search engines cropping up, tailored to helping people find the perfect threads to buy online by applying computer vision and other AI technologies to perform smarter-than-keywords visual search which can easily match and surface specific shapes and styles. Startups like Donde Search, Glisten and Stye.ai to name a few.
Early stage London-based Cadeera, which is in the midst of raising a seed round, wants to apply a similar AI visual search approach but for interior decor. All through the pandemic it’s been working on a prototype with the aim of making ecommerce discovery of taste-driven items like sofas, armchairs and coffee tables a whole lot more inspirational.
Founder and CEO Sebastian Spiegler, an early (former) SwiftKey employee with a PhD in machine learning and natural language processing, walked TechCrunch through a demo of the current prototype.
The software offers a multi-step UX geared towards first identifying a person’s decor style preferences — which it does by getting them to give a verdict on a number of look book images of rooms staged in different interior decor styles (via a Tinder-style swipe left or right).
It then uses these taste signals to start suggesting specific items to buy (e.g. armchairs, sofas etc) that fit the styles they’ve liked. The user can continue to influence selections by asking to see other similar items (‘more like this’), or see less similar items to broaden the range of stuff they’re shown — injecting a little serendipity into their search.
The platform also lets users search by uploading an image — with Cadeera then parsing its database to surface similar looking items which are available for sale.
It has an AR component on its product map, too — which will eventually also let users visualize a potential purchase in situ in their home. Voice search will also be supported.
“Keyword search is fundamentally broken,” argues Spiegler. “Imagine you’re refurbishing or renovating your home and you say I’m looking for something, I’ve seen it somewhere, I only know when I see it, and I don’t really know what I want yet, so the [challenge we’re addressing is this] whole process of figuring out what you want.”
“The mission is understanding personal preferences. If you don’t know yourself what you’re looking for we’re basically helping you with visual clues and with personalization and with inspiration pieces — which can be content, images and then at some point community as well — to figure out what you want. And for the retailer it helps them to understand what their clients want.”
“It increases trust, you’re more sure about your purchases, you’re less likely to return something — which is a huge cost to retailers. And, at the same time, you may also buy more because you more easily find things you can buy,” he adds.
Ecommerce has had a massive boost from the pandemic which continues to drive shopping online. But the flip side of that is bricks-and-mortar retailers have been hit hard.
The situation may be especially difficult for furniture retailers that may well have been operating showrooms before COVID-19 — relying upon customers being able to browse in-person to drive discovery and sales — so they are likely to be looking for smart tools that can help them transition to and/or increase online sales.
And sector-specific visual search engines do seem likely to see uplift as part of the wider pandemic-driven ecommerce shift.
“The reason why I want to start with interior design/home decor and furniture is that it’s a clearly underserved market. There’s no-one out there, in my view, that has cracked the way to search and find things more easily,” Spiegler tells TechCrunch. “In fashion there are quite a few companies out there. And I feel like we can master furniture and home decor and then move into other sectors. But for me the opportunity is here.”
“We can take a lot of the ideas from the fashion sector and apply it to furniture,” he adds. “I feel like there’s a huge gap — and no-one has looked at it sufficiently.”
The size of the opportunity Cadeera is targeting is a $10BN-$20BN market globally, per Spiegler.
The startup’s initial business model is b2b — with the plan being to start selling its SaaS to ecommerce retailers to integrate the visual search tools directly into their own websites.
Spiegler says they’re working with a “big” UK-based vintage platform — and aiming to get something launched to the market within the next six to nine months with one to two customers.
They will also — as a next order of business — offer apps for ecommerce platforms such as WooCommerce, BigCommerce and Shopify to integrate a set of their search tools. (Larger retailers will get more customization of the platform, though.)
On the question of whether Cadeera might develop a b2c offer by launching a direct consumer app itself, Spiegler admits that is an “end goal”.
“This is the million dollar question — my end-goal, my target is building a consumer app. Building a central place where all your shopping preferences are stored — kind of a mix of Instagram where you see inspiration and Pinterest where you can keep what you looked at and then get relevant recommendations,” he says.
“This is basically the idea of a product search engine we want to build. But what I’m showing you are the steps to get there… and we hopefully end in the place where we have a community, we have a b2c app. But the way I look at it is we start through b2b and then at some point switch the direction and open it up by providing a single entry point for the consumer.”
But, for now, the b2b route means Cadeera can work closely with retailers, increasing its understanding of retail market dynamics and getting access to key data needed to power its platform, such as style look books and item databases.
“What we end up with is a large inventory data-set/database, a design knowledge base and imagery and style meta information. And on top of that we do object detection, object recognition, recommendation, so the whole shebang in AI — for the purpose of personalization, exploration, search and suggestion/recommendation,” he goes on, sketching the various tech components involved.
“On the other side we provide an API so you can integrate into use as well. And if you need we can also provide with a responsive UX/UI.”
“Beyond all of that we are creating an interesting data asset where we understand what the user wants — so we have user profiles, and in the future those user profiles can be cross-platform. So if you purchase something at one ecommerce site or one retailer you can then go to another retailer and we can make relevant recommendations based on what you purchased somewhere else,” he adds. “So your whole purchasing history, your style preferences and interaction data will allow you to get the most relevant recommendations.”
While the usual tech giant suspects still dominate general markets for search (Google) and ecommerce (Amazon), Cadeera isn’t concerned about competition from the biggest global platforms — given they are not focused on tailoring tools for a specific furniture/home decor niche.
He also points out that Amazon continues to do a very poor job on recommendations on its own site, despite having heaps of data.
“I’ve been asking — and I’ve been asked as well — so many times why is Amazon doing such a poor job on recommendations and in search. The true answer is I don’t know! They have probably the best data set… but the recommendations are poor,” he says. “What we’re doing here is trying to reinvent a whole product. Search should work… and the inspiration part, for things that are more opaque, is something important that is missing with anything I’ve seen so far.”
And while Facebook did acquire a home decor-focused visual search service (called GrokStyle) back in 2019, Spiegler suggests it’s most likely to integrate their tech (which included AR for visualization) into its own marketplace — whereas he’s convinced most retailers will want to be able to remain independent of the Facebook walled garden.
“GrokStyle will become part of Facebook marketplace but if you’re a retailer the big question is how much do you want to integrate into Facebook, how much do you want to be dependent on Facebook? And I think that’s a big question for a lot of retailers. Do you want to be dependent on Google? Do you want to be dependent on Amazon? Do you want to be dependent on Facebook?” he says. “My guess is no. Because you basically want to stay as far away as possible because they’re going to eat up your lunch.”
Sports have been among some of the most popular and lucrative media plays in the world, luring broadcasters, advertisers and consumers to fork out huge sums to secure the chance to watch (and sponsor) their favorite teams and athletes.
That content, unsurprisingly, also typically costs a ton of money to produce, narrowing the production and distribution funnel even more. But today, a startup that’s cracked open that model with an autonomous, AI-based camera that lets any team record, edit and distribute their games, is announcing a round of funding to build out its business targeting the long tail of sporting teams and fixtures.
Veo Technologies, a Copenhagen startup, has picked up €20 million (around $24.5 million) in a Series B round of funding. The company has designed a video camera and cloud-based subscription service that records games and automatically picks out highlights, which it then hosts on a platform where customers can access and share the video.
The funding is being led by Danish investor Chr. Augustinus Fabrikker, with participation from US-based Courtside VC, France’s Ventech and Denmark’s SEED Capital. Veo’s CEO and co-founder Henrik Teisbæk said in an interview that the startup is not disclosing its valuation, but a source close to funding tells me that it’s well over $100 million.
Teisbæk said that the plan will be to use the funds to continue expanding the company’s business on two levels. First, Veo will be digging into expanding its US operations, with an office in Miami.
Second, it plans to continue enhancing the scope of its technology: The company started out optimising its computer vision software to record and track the matches for the most popular team sport in the world, football (soccer to US readers), with customers buying the cameras — which retail for $800 — and the corresponding (mandatory) subscriptions — $1,200 annually — both to record games for spectators, as well as to use the footage for all kinds of practical purposes like training and recruitment videos. The key is that the cameras can be set up and left to run on their own. Once they are in place, they can record using wide-angles the majority of a soccer field (or whatever playing space is being used) and then zoom and edit down based on that.
Now, Veo is building the computer vision algorithms to expand that proposition into a plethora of other team-based sports including rugby, basketball and hockey, and it is ramping up the kinds of analytics that it can provide around the clips that it generates as well as the wider match itself.
Even with the slowdown in a lot of sporting activity this year due to Covid — in the UK for example, we’re in a lockdown again where team sports below professional leagues, excepting teams for disabled people, have been prohibited — Veo has seen a lot of growth.
The startup currently works with some 5,000 clubs globally ranging from professional sports teams through to amateur clubs for children, and it has recorded and tracked 200,000 games since opening for business in 2018, with a large proportion of that volume in the last year and in the US.
For a point of reference, in 2019, when we covered a $6 million round for Veo, the startup had racked up 1,000 clubs and 25,000 games, pointing to customer growth of 400% in that period.
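The 400% figure follows from the standard percent-growth formula applied to the club counts quoted above (1,000 then, some 5,000 now). The Python sketch below works it through and applies the same formula to the game counts; the function name `percent_growth` is just an illustrative label.

```python
def percent_growth(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100.0


# Figures quoted in the article: clubs and recorded games, 2019 versus now.
clubs_growth = percent_growth(1_000, 5_000)
games_growth = percent_growth(25_000, 200_000)

print(f"Clubs: {clubs_growth:.0f}% growth")
print(f"Games: {games_growth:.0f}% growth")
```

By the same measure, recorded games grew faster than the customer base, at 700%.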
The Covid-19 pandemic has indeed altered the playing field — literally and figuratively — for sports in the past year. Spectators, athletes, and supporting staff need to be just as mindful as anyone else when it comes to spreading the coronavirus.
That’s not just led to a change in how many games are being played, but also for attendance: witness the huge lengths that the NBA went to last year to create an extensive isolation bubble in Orlando, Florida, to play out the season, with no actual fans in physical seats watching games, but all games and fans virtually streamed into the events as they happened.
That NBA effort, needless to say, came at a huge financial cost, one that any lesser league would never be able to carry, and so that predicament has led to an interesting use case for Veo.
Pre-pandemic, the Danish startup was quietly building its business around catering to the long tail of sporting organizations who — even in the best of times — would be hard pressed to find the funds to buy cameras and/or hire videographers to record games, not just an essential part of how people can enjoy a sporting event, but useful for helping with team development.
“There is a perception that football is already being recorded and broadcast, but in the UK (for example) it’s only the Premier League,” Teisbæk said. “If you go down one or two steps from that, nothing is being recorded.” Before Veo, to record a football game, he added, “you need a guy sitting on a scaffold, and time and money to then cut that down to highlights. It’s just too cumbersome. But video is the best tool there is to develop talent. Kids are visual learners. And it’s a great way to get recruited sending videos to colleges.”
Those use cases then expanded with the pandemic, he said. “Under coronavirus rules, parents cannot go out and watch their kids, and so video becomes a tool to follow those matches.”
‘We’re a Shopify, not an Amazon’
The business model for Veo up to now has largely been around what Teisbæk described as “the long tail theory”, which in the case of sports works out, he said, as “There won’t be many viewers for each match, but there are millions of matches out there.” But if you consider how a lot of high school sports will attract locals beyond those currently attached to a school — you have alumni supporters and fans, as well as local businesses and neighborhoods — even that long tail audience might be bigger than one might imagine.
Veo’s long-tail focus has inevitably meant that its target users are in the wide array of amateur or semi-pro clubs and the people associated with them, but interestingly it has also spilled into big names, too.
Veo’s cameras are being used by professional soccer clubs in the Premier League, Spain’s La Liga, Italy’s Serie A and France’s Ligue 1, as well as several clubs in the MLS such as Inter Miami, Austin FC, Atlanta United and FC Cincinnati. Teisbæk noted that while this might never be for primary coverage, it’s there to supplement for training and also be used in the academies attached to those organizations.
The plan longer term, he said, is not to build its own media empire with the trove of content that it has amassed, but to be an enabler for creating that content for its customers, who can in turn use it as they wish. It’s a “Shopify, not an Amazon,” said Teisbæk.
“We are not building the next ESPN, but we are helping the clubs unlock these connections that are already in place by way of our technology,” he said. “We want to help them capture and stream their matches and their play for the audience that is there today.”
That may be how he views the opportunity, but some investors are already eyeing up the bigger picture.
Vasu Kulkarni, a partner at Courtside VC — a firm that has focused (as its name might imply) on backing a lot of different sports-related businesses, with The Athletic, Beam (acquired by Microsoft), and many others in its portfolio — said that he’d been looking to back a company like Veo, building a smart, tech-enabled way to record and parse sports in a more cost-effective way.
“I spent close to four years trying to find a company trying to do that,” he said.
“I’ve always been a believer in sports content captured at the long tail,” he said. Coincidentally, he himself started a company called Krossover in his dorm room to help somewhat with tracking and recording sports training. Krossover eventually was acquired by Hudl, which Veo sees as a competitor.
“You’ll never have the NBA finals recorded on Veo, there is just too much at stake, but when you start to look at all the areas where there isn’t enough mass media value to hire people, to produce and livestream, you get to the point where computer vision and AI are going to be doing the filming to get rid of the cost.”
He said that the economics are important here: the camera needs to be less than $1,000 (which it is) and produce something demonstrably better than “a parent with a Best Buy camcorder that was picked up for $100.”
Kulkarni thinks that longer term there could definitely be an opportunity to consider how to help clubs bring that content to a wider audience, especially using highlights and focusing on the best of the best in amateur games — which of course are the precursors to some of those players one day being world-famous elite athletes. (Think of how exciting it is to see the footage of Michael Jordan playing as a young student for some context here.) “AI will be able to pull out the best 10-15 plays and stitch them together for highlight reels,” he said, something that could feasibly find a market with sports fans wider than just the parents of the actual players.
All of that then feeds a bigger market for what has started to feel like an insatiable appetite for sports, one that, if anything, has found even more audience at a time when many are spending more time at home and watching video overall. “The more video you get from the sport, the better the sport gets, for players and fans,” Teisbæk said.
AWS has launched a new hardware device, the AWS Panorama Appliance, which, alongside the AWS Panorama SDK, will transform existing on-premises cameras into computer vision enabled super-powered surveillance devices.
Pitching the hardware as a new way for customers to inspect parts on manufacturing lines, ensure that safety protocols are being followed, or analyze traffic in retail stores, the new automation service is part of the theme of this AWS re:Invent event — automate everything.
Along with computer vision models that companies can develop using Amazon SageMaker, the new Panorama Appliance can run those models on video feeds from networked or network-enabled cameras.
Soon, AWS expects to have the Panorama SDK that can be used by device manufacturers to build Panorama-enabled devices.
Amazon has already pitched surveillance technologies to developers and the enterprise before. Back in 2017, the company unveiled DeepLens, which it began selling one year later. It was a way for developers to build prototype machine learning models and for Amazon to get comfortable with different ways of commercializing computer vision capabilities.
As we wrote in 2018:
DeepLens is deeply integrated with the rest of AWS’s services. Those include the AWS IoT service Greengrass, which you use to deploy models to DeepLens, for example, but also SageMaker, Amazon’s newest tool for building machine learning models… Indeed, if all you want to do is run one of the pre-built samples that AWS provides, it shouldn’t take you more than 10 minutes to set up … DeepLens and deploy one of these models to the camera. Those project templates include an object detection model that can distinguish between 20 objects (though it had some issues with toy dogs, as you can see in the image above), a style transfer example to render the camera image in the style of van Gogh, a face detection model and a model that can distinguish between cats and dogs and one that can recognize about 30 different actions (like playing guitar, for example). The DeepLens team is also adding a model for tracking head poses. Oh, and there’s also a hot dog detection model.
Amazon has had a lot of experience (and controversy) when it comes to the development of machine learning technologies for video. The company’s Rekognition software sparked protests and pushback which led to a moratorium on the use of the technology.
And the company has tried to incorporate more machine learning capabilities into its consumer facing Ring cameras as well.
Still, enterprises continue to clamor for new machine learning-enabled video recognition technologies for security, safety, and quality control. Indeed, as the COVID-19 pandemic drags on, new protocols around building use and occupancy are being adopted to not only adapt to the current epidemic, but plan ahead for spaces and protocols that can help mitigate the severity of the next one.
Intel and Nvidia chips power a supercomputing center that tracks people in a place where government suppresses minorities, raising questions about the tech industry’s responsibility.
You wait ages for foot scanning startups to help with the tricky fit issue that troubles online shoe shopping and then two come along at once: Launching today in time for Black Friday sprees is Xesto — which like Neatsy, which we wrote about earlier today, also makes use of the iPhone’s TrueDepth camera to generate individual 3D foot models for shoe size recommendations.
The Canadian startup hasn’t always been focused on feet. It has a long-standing research collaboration with the University of Toronto, alma mater of its CEO and co-founder Sophie Howe (its other co-founder and chief scientist, Afiny Akdemir, is also pursuing a Math PhD there) — and was actually founded back in 2015 to explore business ideas in human computer interaction.
But Howe tells us it moved into mobile sizing shortly after the 2017 launch of the iPhone X — which added a 3D depth camera to Apple’s smartphone. Since then Apple has added the sensor to additional iPhone models, pushing it within reach of a larger swathe of iOS users. So you can see why startups are spying a virtual fit opportunity here.
“This summer I had an aha! moment when my boyfriend saw a pair of fancy shoes on a deep discount online and thought they would be a great gift. He couldn’t remember my foot length at the time, and knew I didn’t own that brand so he couldn’t have gone through my closet to find my size,” says Howe. “I realized in that moment shoes as gifts are uncommon because they’re so hard to get correct because of size, and no one likes returning and exchanging gifts. When I’ve bought shoes for him in the past, I’ve had to ruin the surprise by calling him – and I’m not the only one. I realized in talking with friends this was a feature they all wanted without even knowing it… Shoes have such a cult status in wardrobes and it is time to unlock their gifting potential!”
Howe slid into this TechCrunch writer’s DMs with the eye-catching claim that Xesto’s foot-scanning technology is more accurate than Neatsy’s — sending a Xesto scan of her foot compared to Neatsy’s measure of it to back up the boast. (Aka: “We are under 1.5 mm accuracy. We compared against Neatsy right now and they are about 1.5 cm off of the true size of the app,” as she put it.)
Another big difference is Xesto isn’t selling any shoes itself. Nor is it interested in just sneakers; it’s shoe-type agnostic. If you can put it on your feet it wants to help you find the right fit, is the idea.
Right now the app is focused on the foot scanning process and the resulting 3D foot models — showing shoppers their feet in a 3D point cloud view, another photorealistic view as well as providing granular foot measurements.
There’s also a neat feature that lets you share your foot scans so, for example, a person who doesn’t have their own depth-sensing iPhone could ask to borrow a friend’s to capture and take away scans of their own feet.
Helping people who want to be bought (correctly fitting) shoes as gifts is the main reason they’ve added foot scan sharing, per Howe — who notes shoppers can create and store multiple foot profiles on an account “for ease of group shopping”.
“Xesto is solving two problems: Buying shoes [online] for yourself, and buying shoes for someone else,” she tells TechCrunch. “Problem 1: When you buy shoes online, you might be unfamiliar with your size in the brand or model. If you’ve never bought from a brand before, it is very risky to make a purchase because there is very limited context in selecting your size. With many brands you translate your size yourself.
“Problem 2: People don’t only buy shoes for themselves. We enable gift and family purchasing (within a household or remote!) by sharing profiles.”
Xesto is doing its size predictions based on comparing a user’s (<1.5mm accurate) foot measurements to brands’ official sizing guidelines — with more than 150 shoe brands currently supported.
Howe says it plans to incorporate customer feedback into these predictions — including by analyzing online reviews where people tend to specify if a particular shoe sizes larger or smaller than expected. So it’s hoping to be able to keep honing the model’s accuracy.
“What we do is remove the uncertainty of finding your size by taking your 3D foot dimensions and correlate that to the brands sizes (or shoe model, if we have them),” she says. “We use the brands size guides and customer feedback to make the size recommendations. We have over 150 brands currently supported and are continuously adding more brands and models. We also recommend if you have extra wide feet you read reviews to see if you need to size up (until we have all that data robustly gathered).”
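To make the idea of correlating foot dimensions with a brand's size guide concrete, here is a minimal Python sketch. It is not Xesto's method, which the company has not published in detail. The chart values, the function names and the borderline check are all hypothetical and for illustration only; the one figure taken from the article is the 1.5mm accuracy claim.

```python
from bisect import bisect_left

# Hypothetical size chart: (maximum foot length in mm, size label), ascending.
# The numbers are illustrative placeholders, not any real brand's size guide.
EXAMPLE_CHART = [
    (245.0, "US 7"),
    (254.0, "US 8"),
    (262.0, "US 9"),
    (271.0, "US 10"),
]


def recommend_size(foot_length_mm, chart):
    """Return the smallest size whose maximum foot length covers the foot.

    Returns None when the foot is longer than every entry in the chart.
    """
    limits = [limit for limit, _ in chart]
    # First index whose limit is >= the measured foot length.
    idx = bisect_left(limits, foot_length_mm)
    if idx == len(chart):
        return None
    return chart[idx][1]


def is_borderline(foot_length_mm, chart, accuracy_mm=1.5):
    """True when the measurement sits within the scan's claimed accuracy
    of a size boundary, so measurement error alone could change the size."""
    return any(abs(limit - foot_length_mm) <= accuracy_mm for limit, _ in chart)


if __name__ == "__main__":
    length = 257.3  # hypothetical scanned foot length in mm
    print(recommend_size(length, EXAMPLE_CHART), is_borderline(length, EXAMPLE_CHART))
```

A borderline result is the kind of case where extra signals such as customer reviews or width data, which Howe says the team is still gathering, would matter most.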
Asked about the competitive landscape, given all this foot scanning action, Howe admits there’s a number of approaches trying to help with virtual shoe fit — such as comparative brand sizing recommendations or even foot scanning with pieces of paper. But she argues Xesto has an edge because of the high level of detail of its 3D scans — and on account of its social sharing feature. Aka this is an app to make foot scans you can send your bestie for shopping keepsies.
“What we do that is unique is only use 3D depth data and computer vision to create a 3D scan of the foot with under 1.5mm accuracy (unmatched as far as we’ve seen) in only a few minutes,” she argues. “We don’t ask you any information about your feet, or to use a reference object. We make size recommendations based on your feet alone, then let you share them seamlessly with loved ones. Size sharing is a unique feature we haven’t seen in the sizing space that we’re incredibly excited about (not only because we will get more shoes as gifts :D).”
Xesto’s iOS app is free for shoppers to download. It’s also entirely free to create and share your foot scan in glorious 3D point cloud — and will remain so according to Howe. The team’s monetization plan is focused on building out partnerships with retailers, which is on the slate for 2021.
“Right now we’re not taking any revenue but next year we will be announcing partnerships where we work directly within brands ecosystems,” she says, adding: “[We wanted to offer] the app to customers in time for Black Friday and the holiday shopping season. In 2021, we are launching some exciting initiatives in partnership with brands. But the app will always be free for shoppers!”
Since being founded around five years ago, Howe says Xesto has raised a pre-seed round from angel investors and secured national advanced research grants, as well as taking in some revenue over its lifetime. The team has one patent granted and one pending for their technologies, she adds.
U.S.-based startup Neatsy AI is using the iPhone’s depth-sensing FaceID selfie camera as a foot scanner to capture 3D models for predicting a comfortable sneaker fit.
Its app, currently soft launched for iOS but due to launch officially next month, asks the user a few basic questions about sneaker fit preference before walking through a set of steps to capture a 3D scan of their feet using the iPhone’s front-facing camera. The scan is used to offer personalized fit predictions for a selection of sneakers offered for sale in-app — displaying an individualized fit score (out of five) in green text next to each sneaker model.
Shopping for shoes online can lead to high return rates once buyers actually get to slip on their chosen pair, since shoe sizing isn’t standardized across different brands. That’s the problem Neatsy wants its AI to tackle by incorporating another more individual fit signal into the process.
The startup, which was founded in March 2019, has raised $400K in pre-seed funding from angel investors to get its iOS app to market. The app is currently available in the US, UK, Germany, France, Italy, Spain, Netherlands, Canada and Russia.
Neatsy analyzes app users’ foot scans using a machine learning model it’s devised to predict a comfy fit across a range of major sneaker brands — currently including Puma, Nike, Jordan Air and Adidas — based on scanning the insoles of sneakers, per CEO and founder Artem Semyanov.
He says they’re also factoring in the material shoes are made of and will be honing the algorithm on an ongoing basis based on fit feedback from users. (The startup says it’s secured a US patent for its 3D scanning tech for shoe recommendations.)
The team tested the algorithm’s efficiency via some commercial pilots this summer — and say they were able to demonstrate a 2.7x reduction in sneaker return rates based on size, and a 1.9x decrease in returns overall, for a focus group with 140 respondents.
Handling returns is clearly a major cost for online retailers — Neatsy estimates that sneaker returns specifically rack up $30BN annually for ecommerce outlets, factoring in logistics costs and other factors like damaged boxes and missing sneakers.
“All in all, shoe ecommerce returns vary among products and shops between 30% and 50%. The most common reasons for this category are fit & size mismatch,” says Semyanov, who headed up the machine learning team at Prism Labs prior to founding Neatsy.
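For a rough sense of scale, the short Python sketch below applies the pilot's 1.9x overall reduction to the 30% and 50% ends of the return-rate range Semyanov cites. This is purely illustrative arithmetic: it assumes an "N x reduction" means dividing the baseline rate by N, and that a result from a 140-person focus group would carry over to those baselines, which neither Neatsy nor the article claims.

```python
def rate_after_reduction(baseline_rate, reduction_factor):
    """Read an 'N x reduction' as dividing the baseline return rate by N."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return baseline_rate / reduction_factor


if __name__ == "__main__":
    # Baselines are the 30% and 50% ends of the range quoted in the article.
    for baseline in (0.30, 0.50):
        after = rate_after_reduction(baseline, 1.9)
        print(f"baseline {baseline:.0%} -> about {after:.1%} after a 1.9x reduction")
```

Under those assumptions a 30% return rate would fall to roughly 16%, and a 50% rate to roughly 26%.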
“According to Zappos, customers who purchase its most expensive footwear ultimately return ~50% of everything they buy. 70% online shoppers make returns each year. Statista estimates return deliveries will cost businesses $550 billion by 2020,” he tells us responding to questions via email.
“A 2019 survey from UPS found that, for 73% of shoppers, the overall returns experience impacts how likely they are to purchase from a given retailer again, and 68% say the experience impacts their overall perceptions of the retailer. That’s the drama here!
“Retailers are forced to accept steep costs of returns because otherwise, customers won’t buy. Vs us who want to treat the main reasons of returns rather than treating the symptoms.”
Ecommerce giants like Amazon address this issue by focusing on logistics to reduce friction in the delivery process, speeding up deliveries and returns so customers spend less time waiting to get the right stuff. Meanwhile, scores of startups have been trying to tackle size and fit with a variety of digital (and/or less high tech) tools over the past five+ years, from 3D body models to ‘smart’ sizing suits or even brand- and garment-specific sizing tape (Nudea’s fit tape for bras). But no one has managed to come up with a single solution that works for everything and everyone. And a number of these startups have deadpooled or been acquired by ecommerce platforms without a whole lot to show for it.
While Neatsy is attempting to tackle what plenty of other founders have tried to do on the fit front, it is at least targeting a specific niche (sneakers) — a relatively narrow focus that may help it hone a useful tool.
It’s also able to lean on mainstream availability of the iPhone’s sensing hardware to get a leg up. (Whereas a custom shoe design startup that’s been around for longer, Solely Original, has offered custom fit by charging a premium to send out an individual fit kit.)
But even zeroing in on sneaker comfort, Neatsy’s foot scanning process does require the user to correctly navigate quite a number of steps (see the full flow in the below video). Plus you need to have a pair of single-block colored socks handy (stripy sock lovers are in trouble). So it’s not a two second process, though the scan only has to be done once.
At the time of writing we hadn’t been able to test Neatsy’s scanning process for ourselves as it requires an iPhone with a FaceID depth-sensing camera. On this writer’s 2nd-gen iPhone SE, the app allowed me to swipe through each step of the scan instruction flow but then hung at what should have been the commencement of scanning, displaying a green outline template of a left foot against a black screen.
This is a bug the team said they’ll be fixing so the scanner gets turned off entirely for iPhone models that don’t have the necessary hardware. (Its App Store listing states it’s compatible with iPhone SE (2nd generation), though doesn’t specify the foot scan feature isn’t.)
While the current version of Neatsy’s app is a direct to consumer ecommerce play, targeting select sneaker models at app savvy Gen Z/Millennials, it’s clearly intended as a shopfront for retailers to check out the technology.
When we ask about this, Semyanov confirms the startup’s longer-term ambition is for its custom fit model to become a standard piece of the ecommerce puzzle.
“Neatsy app is our fastest way to show the world our vision of what the future online shop should be,” he tells TechCrunch. “It attracts users to shops and we get revenue share when users buy sneakers via us. The app serves as a new low-return sales channel for a retailer and as a way to see the economic effect on returns by themselves.
“Speaking long term we think that our future is B2B and all ecommerce shops would eventually have a fitting tech, we bet it will be ours. It will be the same as having a credit card payment integration in your online shop.”
Chooch.ai, a startup that hopes to bring computer vision more broadly to companies to help them identify and tag elements at high speed, announced a $20 million Series A today.
Vickers Venture Partners led the round with participation from 212, Streamlined Ventures, Alumni Ventures Group, Waterman Ventures and several other unnamed investors. Today’s investment brings the total raised to $25.8 million, according to the company.
“Basically we s