Machine learning programs have recently made huge advances. Stephen Marche tested one against Shakespeare’s collected works, to see if it could help him figure out which of the several versions of Hamlet’s soliloquy was most likely what the playwright intended.
New York-based Blackbird.AI has closed a $10 million Series A as it prepares to launch the next version of its disinformation intelligence platform this fall.
The Series A is led by Dorilton Ventures, along with new investors including Generation Ventures, Trousdale Ventures, StartFast Ventures and Richard Clarke, former chief counter-terrorism advisor for the National Security Council. Existing investor NetX also participated.
Blackbird says the funding will be used to scale up to meet demand in new and existing markets, including by expanding its team and spending more on product development.
The 2017-founded startup sells software as a service targeted at brands and enterprises managing risks related to malicious and manipulative information — touting the notion of defending the “authenticity” of corporate marketing.
It’s applying a range of AI technologies to tackle the challenge of filtering and interpreting emergent narratives from across the Internet to identify disinformation risks targeting its customers. (And, for the record, this Blackbird is no relation to an earlier NLP startup, called Blackbird, which was acquired by Etsy back in 2016.)
Blackbird AI is focused on applying automation technologies to detect malicious/manipulative narratives — so the service aims to surface emerging disinformation threats for its clients, rather than delving into the tricky task of attribution. On that front it’s only looking at what it calls “cohorts” (or “tribes”) of online users — who may be manipulating information collectively, for a shared interest or common goal (talking in terms of groups like antivaxxers or “bitcoin bros”).
Blackbird CEO and co-founder Wasim Khaled says the team has chalked up five years of R&D and “granular model development” to get the product to where it is now.
“In terms of technology the way we think about the company today is an AI-driven disinformation and narrative intelligence platform,” he tells TechCrunch. “This is essentially the efforts of five years of very in-depth, ears to the ground research and development that has really spanned people everywhere from the comms industry to national security to enterprise and Fortune 500, psychologists, journalists.
“We’ve just been non-stop talking to the stakeholders, the people in the trenches — to understand where their problem sets really are. And, from a scientific empirical method, how do you break those down into its discrete parts? Automate pieces of it, empower and enable the individuals that are trying to make decisions out of all of the information disorder that we see happening.”
The first version of Blackbird’s SaaS was released in November 2020 but the startup isn’t disclosing customer numbers as yet. v2 of the platform will be launched this November, per Khaled.
It's also announcing a partnership today with PR firm Weber Shandwick to provide support to customers on how to respond to specific malicious messaging that could impact their businesses and which its platform has flagged as an emerging risk.
Disinformation has of course become a much labelled and discussed feature of online life in recent years, although it's hardly a new (human) phenomenon. (See, for example, the orchestrated airborne leaflet propaganda drops used during wartime to spread unease among enemy combatants and populations.) However, it's fair to say that the Internet has supercharged the ability of intentionally bad/bogus content to spread and cause reputational and other types of harm.
Studies show that "fake news" (as this stuff is sometimes also called) travels far faster online than truthful information. And here the ad-funded business models of mainstream social media platforms are implicated, since their commercial content-sorting algorithms are incentivized to amplify whatever is most engaging to eyeballs, which isn't usually the grey and nuanced truth.
Stock and crypto trading is another growing incentive for spreading disinformation — just look at the recent example of Walmart targeted with a fake press release suggesting the retailer was about to accept litecoin.
All of which makes countering disinformation look like a growing business opportunity.
Earlier this summer, for example, another stealthy startup in this area, ActiveFence, uncloaked to announce a $100M funding round. Others in the space include Primer and Yonder (previously New Knowledge), to name a few.
Some earlier players in the space were acquired by tech giants wrestling with how to clean up their own disinformation-ridden platforms, such as UK-based Fabula AI, which was bought by Twitter in 2019.
Another — Bloomsbury AI — was acquired by Facebook. And the tech giant now routinely tries to put its own spin on its disinformation problem by publishing reports that contain a snapshot of what it dubs “coordinated inauthentic behavior” that it’s found happening on its platforms (although Facebook’s selective transparency often raises more questions than it answers.)
The problems created by bogus online narratives ripple far beyond key host and spreader platforms like Facebook — with the potential to impact scores of companies and organizations, as well as democratic processes.
But while disinformation is a problem that can now scale everywhere online and affect almost anything and anyone, Blackbird is concentrating on selling its counter tech to brands and enterprises — targeting entities with the resources to pay to shrink reputational risks posed by targeted disinformation.
Per Khaled, Blackbird's product — which consists of an enterprise dashboard and an underlying data processing engine — is not just doing data aggregation; the startup is in the business of intelligently structuring the threat data its engine gathers, he says. He also argues it goes further than some rival offerings that do NLP (natural language processing) plus maybe some "light sentiment analysis," as he puts it.
NLP is also a key area of focus for Blackbird, along with network analysis and doing things like looking at the structure of botnets.
But the suggestion is that Blackbird goes further than the competition by virtue of considering a wider range of factors to help identify threats to the "integrity" of corporate messaging. (Or, at least, that's its marketing pitch.)
Khaled says the platform focuses on five “signals” to help it deconstruct the flow of online chatter related to a particular client and their interests — which he breaks down thusly: Narratives, networks, cohorts, manipulation and deception. And for each area of focus Blackbird is applying a cluster of AI technologies, according to Khaled.
But while the aim is to leverage the power of automation to tackle the scale of the disinformation challenge that businesses now face, Blackbird isn’t able to do this purely with AI alone; expert human analysis remains a component of the service — and Khaled notes that, for example, it can offer customers (human) disinformation analysts to help them drill further into their disinformation threat landscape.
“What really differentiates our platform is we process all five of these signals in tandem and in near real-time to generate what you can think of almost as a composite risk index that our clients can weigh, based on what might be most important to them, to rank the most important action-oriented information for their organization,” he says.
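The "composite risk index" idea can be sketched in code. To be clear, Blackbird hasn't published its scoring method: the five signal names below come from the article, but the weights and the weighted-average math are invented purely for illustration.

```python
# Hypothetical sketch of a client-weighted composite risk index built
# from Blackbird's five named signals. The scoring scheme is invented;
# only the signal names come from the article.

SIGNALS = ("narratives", "networks", "cohorts", "manipulation", "deception")

def composite_risk(scores, weights=None):
    """Combine per-signal scores (each in [0, 1]) into one index.

    Clients supply their own weights to rank what matters most to
    their organization, as the article describes.
    """
    if weights is None:
        weights = {s: 1.0 for s in SIGNALS}  # equal weighting by default
    total = sum(weights[s] for s in SIGNALS)
    return sum(scores[s] * weights[s] for s in SIGNALS) / total

# A hypothetical client that cares most about coordinated manipulation:
index = composite_risk(
    {"narratives": 0.4, "networks": 0.7, "cohorts": 0.5,
     "manipulation": 0.9, "deception": 0.6},
    weights={"narratives": 1, "networks": 1, "cohorts": 1,
             "manipulation": 3, "deception": 2},
)
```

The point of the weighting is the one Khaled makes below: the same five raw signals can be ranked differently per client, so the index is a lens on the data rather than a single universal score.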
“Really it’s this tandem processing — quantifying the attack on human perception that we see happening; what we think of as a cyber attack on human perception — how do you understand when someone is trying to shift the public’s perception? About a topic, a person, an idea. Or when we look at corporate risk, more and more, we see when is a group or an organization or a set of accounts trying to drive public scrutiny against a company for a particular topic.
“Sometimes those topics are already in the news but the property that we want our customers or anybody to understand is when is something being driven in a manipulative manner? Because that means there’s an incentive, a motive, or an unnatural set of forces… acting upon the narrative being spread far and fast.”
“We’ve been working on this, and only this, and early on decided to do a purpose-built system to look at this problem. And that’s one of the things that really set us apart,” he also suggests, adding: “There are a handful of companies that are in what is shaping up to be a new space — but often some of them were in some other line of work, like marketing or social and they’ve tried to build some models on top of it.
“For bots — and for all of the signals we talked about — I think the biggest challenge for many organizations if they haven’t completely purpose built from scratch like we have… you end up against certain problems down the road that prevent you from being scalable. Speed becomes one of the biggest issues.
“Some of the largest organizations we’ve talked to could in theory produce the signals — some of the signals that I talked about before — but the lift might take them 10 to 12 days. Which makes it really unsuited for anything but the most forensic reporting, after things have kinda gone south… What you really need is two minutes or two seconds. And that’s where — from day one — we’ve been looking to get.”
As well as brands and enterprises with reputational concerns — such as those whose activity intersects with the ESG space; aka ‘environmental, social and governance’ — Khaled claims investors are also interested in using the tool for decision support, adding: “They want to get the full picture and make sure they’re not being manipulated.”
At present, Blackbird’s analysis focuses on emergent disinformation threats — aka “nowcasting” — but the goal is also to push into predictive disinformation threat detection, to help prepare clients for information-related manipulation problems before they occur. There’s no timeframe for launching that component yet, though.
“In terms of counter measurement/mitigation, today we are by and large a detection platform, starting to bridge into predictive detection as well,” says Khaled, adding: “We don’t take the word predictive lightly. We don’t just throw it around so we’re slowly launching the pieces that really are going to be helpful as predictive.
“Our AI engine trying to tell [customers] where things are headed, rather than just telling them the moment it happens… based on — at least from our platform’s perspective — having ingested billions of posts and events and instances to then pattern match to something similar to that that might happen in the future.”
“A lot of people just plot a path based on timestamps — based on how quickly something is picking up. That’s not prediction for Blackbird,” he also argues. “We’ve seen other organizations call that predictive; we’re not going to call that predictive.”
In the nearer term, Blackbird has some “interesting” countermeasure tech in its pipeline to assist teams, coming in Q1 and Q2 of 2022, Khaled adds.
Sorcero announced Thursday a $10 million Series A round of funding to continue scaling its medical and technical language intelligence platform.
The latest funding round comes as the company, headquartered in Washington, D.C. and Cambridge, Massachusetts, sees increased demand for its advanced analytics from life sciences and technical companies. Sorcero’s natural language processing platform makes it easier for subject-matter experts to find answers to their questions to aid in better decision making.
CityRock Venture Partners, the growth fund of H/L Ventures, led the round and was joined by new investors Harmonix Fund, Rackhouse, Mighty Capital and Leawood VC, as well as existing investors, Castor Ventures and WorldQuant Ventures. The new investment gives Sorcero a total of $15.7 million in funding since it was founded in 2018.
Dipanwita Das, co-founder and CEO, told TechCrunch that prior to starting Sorcero she was working in public policy, a field where scientific content is useful but often a source of confusion and burden. She thought there had to be a more effective way to make better decisions across the healthcare value chain. That’s when she met co-founders Walter Bender and Richard Graves and started the company.
“Everything is in service of subject-matter experts being faster, better and less prone to errors,” Das said. “Advances of deep learning with accuracy add a lot of transparency. We are used by science affairs and regulatory teams whose jobs it is to collect scientific data and effectively communicate it to a variety of stakeholders.”
The total addressable market for language intelligence is big — Das estimated it to be $42 billion just for the life sciences sector. Due to the demand, the co-founders have seen the company grow at 324% year over year since 2020, she added.
Raising a Series A enables the company to serve more customers across the life sciences sector. The company will invest in talent in both engineering and on the commercial side. It will also put some funds into Sorcero’s go-to-market strategy to go after other use cases.
In the next 12 to 18 months, a big focus for the company will be scaling into product market fit in the medical affairs and regulatory space and closing new partnerships.
Oliver Libby, partner at CityRock Venture Partners, said Sorcero’s platform “provides the rails for AI solutions for companies” that have traditionally struggled with AI technologies as they try to integrate their existing data sets in order to run analysis effectively on top of them.
Rather than have to build custom technology and connectors, Sorcero is “revolutionizing it, reducing time and increasing accuracy,” and if AI is to have a future, it needs a universal translator that plugs into everything, he said.
“One of the hallmarks in the response to COVID was how quickly the scientific community had to do revolutionary things,” Libby added. “The time to vaccine was almost a miracle of modern science. One of the first things they did was track medical resources and turn them into a hook for pharmaceutical companies. There couldn’t have been a better use case for Sorcero than COVID.”
Grammarly, the popular auto editing tool, announced the release of Grammarly for Developers today. The company is starting this effort with the Text Editor SDK (software development kit), which enables programmers to embed Grammarly text editing functionality into any web application.
Rob Brazier, head of product and platform at Grammarly, says that the beta release of this SDK gives developers access to the full power of Grammarly’s automated editing with a couple of lines of code. “Literally in just a couple lines of HTML, [developers] can add Grammarly’s assistance to their application, and they get a native Grammarly experience available to all of their users without the users needing to install or register Grammarly,” Brazier told me.
Under the hood, these developers are getting access to highly sophisticated natural language processing (NLP) technology without requiring any artificial intelligence understanding or experience whatsoever. Instead, developers can take advantage of the work that Grammarly has already done.
While users of the target application don’t need to be Grammarly customers (and that is in fact the idea), if they do happen to be, they can log into their Grammarly accounts and access all of the functionality that comes with that. “If their users have a Grammarly subscription, those users can link their Grammarly accounts into the developer’s application. They can sign in with Grammarly and unlock the additional features of their particular subscriptions [directly] in that application,” he said.
Brazier said that because this is a starting point, the company wanted to keep it basic, get feedback on the beta and then add additional capabilities in the future. “We wanted to start with the simplest possible way of giving access to this capability to the greatest number of users. So that’s why we started with a pretty simple product. I think it’ll evolve over time and grow in sophistication, but it is really just a couple lines of code and you’re up and running,” he said.
This is the company’s first dip into the developer tool space, allowing programmers to access Grammarly functionality and embed it in their applications. This is not unlike the approach Zoom took last year when it released an SDK to tap into video services (although Zoom is much further along on this developer tool journey). As companies like Grammarly and Zoom grow in popularity, it seems the next logical step is to expose the strengths of the platform, in this case text editing, to let developers take advantage of it. In fact, Salesforce was the first to implement this idea in 2007 when it launched Force.com.
This approach also will potentially provide another source of revenue for Grammarly beyond the subscription versions of the product, although Brazier says it’s too early to say what shape that will take. Regardless, today’s announcement is just the first step in a broader strategy to expose different parts of the platform to developers and enable them to take advantage of all the work Grammarly’s engineers put into the platform. Interested developers can apply to be part of the beta program.
On average, men and women speak roughly 15,000 words per day. We call our friends and family, log into Zoom for meetings with our colleagues, discuss our days with our loved ones, or if you’re like me, you argue with the ref about a bad call they made in the playoffs.
Hospitality, travel, IoT and the auto industry are all on the cusp of leveling up voice assistant adoption and the monetization of voice. The global voice and speech recognition market is expected to grow at a CAGR of 17.2% from 2019 to reach $26.8 billion by 2025, according to Meticulous Research. Companies like Amazon and Apple will accelerate this growth as they leverage ambient computing capabilities, which will continue to push voice interfaces forward as a primary interface.
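A quick sanity check on those Meticulous Research figures: a 17.2% CAGR compounding over the six years from 2019 to 2025 and landing at $26.8 billion implies a 2019 base market of roughly $10 billion. (The derivation is ours, not the report's.)

```python
# Back out the implied 2019 market size from the quoted CAGR and
# 2025 target. Only the 17.2% and $26.8B figures come from the
# article; the ~$10B base is derived arithmetic.
target_2025 = 26.8  # $ billions
cagr = 0.172
years = 2025 - 2019  # six compounding years

implied_2019_base = target_2025 / (1 + cagr) ** years  # ~10.3
```

In other words, the forecast has the market more than two-and-a-half times larger in 2025 than in 2019.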
As voice technologies become ubiquitous, companies are turning their focus to the value of the data latent in these new channels. Microsoft’s recent acquisition of Nuance is not just about achieving better NLP or voice assistant technology, it’s also about the trove of healthcare data that the conversational AI has collected.
Our voice technologies have not been engineered to confront the messiness of the real world or the cacophony of our actual lives.
Google has monetized every click of your mouse, and the same thing is now happening with voice. Advertisers have found that speak-through conversion rates are higher than click-through conversion rates. Brands need to begin developing voice strategies to reach customers — or risk being left behind.
Voice tech adoption was already on the rise, but with most of the world under lockdown protocol during the COVID-19 pandemic, adoption is set to skyrocket. Nearly 40% of U.S. internet users used smart speakers at least monthly in 2020, according to Insider Intelligence.
Yet, there are several fundamental technology barriers keeping us from reaching the full potential of the technology.
The steep climb to commercializing voice
By the end of 2020, worldwide shipments of wearable devices rose 27.2% to 153.5 million from a year earlier, but despite all the progress made in voice technologies and their integration in a plethora of end-user devices, they are still largely limited to simple tasks. That is finally starting to change as consumers demand more from these interactions, and voice becomes a more essential interface.
In 2018, in-car shoppers spent $230 billion to order food, coffee, groceries or items to pick up at a store. The auto industry is one of the earliest adopters of voice AI, but in order to really capture voice technology’s true potential, it needs to become a more seamless, truly hands-free experience. Ambient car noise still muddies the signal enough that it keeps users tethered to using their phones.
“Even with its vast local talent, it seems Israel still has many hurdles to overcome in order to become a global fintech hub. [ … ] Having that said, I don’t believe any of these obstacles will prevent Israel from generating disruptive global fintech startups that will become game-changing businesses.”
I wrote that back in 2018, when I was determined to answer whether Israel had the potential to become a global fintech hub. Suffice to say, this prediction from three years ago has become a reality.
In 2019, Israeli fintech startups raised over $1.8 billion; in 2020, they were said to have raised $1.48 billion despite the pandemic. Just in the first quarter of 2021, Israeli fintech startups raised $1.1 billion, according to IVC Research Center and Meitar Law Offices.
It’s no surprise, then, that Israel now boasts over a dozen fintech unicorns in sectors such as payments, insurtech, lending, banking and more, some of which reached that coveted status in early 2021 — like Melio and Papaya Global, which raised $110 million and $100 million, respectively.
Over the years I’ve been fortunate to invest (both as a venture capitalist and personally) in successful early-stage fintech companies in the U.S., Israel and emerging markets — Alloy, Eave, MoneyLion, Migo, Unit, AcroCharge and more.
The major shifts and growth of fintech globally over these years has been largely due to advanced AI-based technologies, heightened regulatory scrutiny, a more innovative and adaptive approach among financial institutions to build partnerships with fintechs, and, of course, the COVID pandemic, which forced consumers to transact digitally.
The pandemic pushed fintechs to become essential for business survival, acting as the main contributor of the rapid migration to digital payments.
So what is it about Israeli-founded fintech startups that makes them stand out from their scaling neighbors across the pond? First and foremost, Israeli founders have brought to the table a distinct perspective and understanding of where the gaps exist within their respective focus industries — whether it was Hippo and Lemonade in the world of property and casualty insurance, Rapyd and Melio in the world of business-to-business payments, or Earnix and Personetics in the world of banking data and analytics.
This is even more compelling given that many of these Israeli founders did not grow within financial services, but rather recognized those gaps, built their know-how around the industry (in some cases by hiring or partnering with industry experts and advisers during their ideation phase, strengthening their knowledge and validation), then sought to build more innovative and customer-focused solutions than most financial institutions can offer.
With this in mind, it is becoming clearer that the Israeli fintech industry has slowly transitioned into a mature ecosystem: local talent with expertise drawn from a multitude of local fintechs that have scaled to success; a more global network of banking and insurance partners that have recognized the Israeli fintech disruptors; and the smart fintech-focused venture capital to go along with it. It’s a combination that will continue to set up Israeli fintech founders for success.
In addition, a major contributor to the fintech industry comes from the technological side. It is never enough to reach unicorn status with just the tech on the back end.
What most likely differentiates Israeli fintech from other ecosystems is the strong technological barriers and infrastructure built from the ground up, which then, of course, leads to the ability to be more customized, compliant, secured, etc. If I had to bet on where I believe Israeli fintech startups could become market leaders, I’d go with the following.
Voice technologies have come a long way over the years; where once you knew you were talking to a robot, now financial institutions and applications offer a fully automated experience that sounds and feels just like a company employee.
Israel has shown growing success in the world of voice tech, with companies like Gong.io providing insights for remote sales teams; Bonobo (acquired by Salesforce) offering insights from customer support calls, texts and other interactions; and Voca.ai (acquired by Snapchat) offering an automated support agent to replace the huge costs of maintaining call centers.
Silicon Valley was once one of the most productive regions in the country for the defense industry, churning out chips and technologies that helped the United States overtake the Soviet Union during the Cold War. Since then, the region has been known far less for silicon and defense than for the consumer internet products of Google, Facebook and Netflix.
A small number of startups though are attempting to revitalize that important government-industry nexus as the rise of China pushes more defense planners in Washington to double down on America’s technical edge. Vannevar Labs is one of this new crop, and it has hit some new milestones in its quest to displace traditional defense contractors with Silicon Valley entrepreneurial acumen.
I last chatted with the company just as it was debuting in late 2019, having raised a $4.5 million seed. The company has been quiet and heads down the past two years as it developed a product and traction within the defense establishment. Now it’s ready to reveal a bit more of what all that work has culminated in.
First, the company officially launched its Vannevar Decrypt product in January of this year. It’s focused on foreign language natural language processing, organizing overseas data and resources that are collected by the intelligence community and then immediately translating and interpreting those documents for foreign policy decision-makers. CEO and co-founder Brett Granberg said that the product “went from one deployment to a dozen adoptions.”
Second, the company raised a $12 million Series A investment in May from Costanoa Ventures and Point72 with General Catalyst participating. Costanoa and GC co-led the startup’s seed round.
Finally, the company has been on a hiring spree. The team has grown into a crew of 20 employees, and the firm last week brought on Scott Sanders — one of the earliest employees at Anduril, where he spent several years — to lead business development. Vannevar also added John Doyle, a long-time Palantir employee who was head of its national security business, to its board, according to Granberg. Today, the team is split evenly between national security folks and technologists, and Granberg says it is set to double this year.
With a few years of hindsight, Granberg says that he has refined what he considers the best model for defense tech startups to break into the hardscrabble market at the Pentagon and across Northern Virginia.
First, there needs to be incredible focus on getting access to actual end users and learning their work. The functions that defense and intelligence personnel perform are completely different from operations in the commercial economy, and trying to translate what works at a large corporation into defense is a fool’s errand. “You need to have both the DNA of understanding new technology and the DNA of deeply understanding a lot of different use cases within DoD,” Granberg said, referencing the Department of Defense.
That’s directly informed how Decrypt has developed over time. “We started focusing on the counter-terrorism space, and as the government moved away from counter-terrorism, we started moving to the foreign actors that were important,” he said. “Once we have our first couple of deployments, we are able to iterate very, very quickly.”
He also strongly eschews a popular view in defense procurement circles that there are “dual-use” technologies that can be used equally well in commercial and defense applications. “Some of the most important mission problems where the government spends the most money and has the most interest,” he explained, are also contexts where commercial off-the-shelf products (dubbed COTS in the industry parlance) are least useful. He says startups targeting defense simply cannot split their bandwidth by also trying to learn commercial use cases.
In fact, he went so far as to predict that “you are going to see a lot of companies that have raised a lot of money that will fizzle out in the coming years” because they just can’t nail the dual-use model well.
Second, he argues that defense tech startups need to move beyond the model that each company should work on one platform, and instead move to an organizational model where a company offers multiple products to reach scale. Each product has the potential to reach “a couple of hundred million in revenue” according to Granberg, but it is hard to expand a company’s size if it doesn’t parallelize product development.
To that end, Granberg said that he pushes Vannevar Labs to always be exploring new product lines for growth. “Decrypt is our first product [but] 10% of our energy is in new product efforts,” he said. “I can imagine when we are three to four years down the line… it might be 9-10 products.” He said that the one-platform approach might have worked for Palantir, which, ironically, is the major winner in the defense tech space of the last few years. But newer companies like Anduril and Shield AI have been designed around product line expansion.
Finally, noting those other companies, Granberg believes there is something of a collective benefit as each startup makes headway in the defense sector. “There is this theory in our space that we don’t view ourselves as competitors — if one of us does well, we all do well,” he said. Given the varied mission requirements of different agencies and the absolute massive scale of budgets in this field, startups actually have a lot of independent terrain to explore, even if they come up against the big legacy defense contractors on a regular basis.
As for Vannevar Labs, its next goal is to turn its Decrypt product into a program of record, which would guarantee it a certain level of sales and revenue for potentially years into the future. That’s a huge bar to leap, but would be a turning point in the company’s long-term trajectory.
We’ve spent the past few weeks burning copious amounts of AWS compute time trying to invent an algorithm to parse Ars’ front-page story headlines to predict which ones will win an A/B test—and we learned a lot. One of the lessons is that we—and by “we,” I mainly mean “me,” since this odyssey was more or less my idea—should probably have picked a less, shall we say, ambitious project for our initial outing into the machine-learning wilderness. Now, a little older and a little wiser, it’s time to reflect on the project and discuss what went right, what went somewhat less than right, and how we’d do this differently next time.
Our readers had tons of incredibly useful comments, too, especially as we got into the meaty part of the project—comments that we’d love to get into as we discuss the way things shook out. The vagaries of the edit cycle meant that the stories were being posted quite a bit after they were written, so we didn’t have a chance to incorporate a lot of reader feedback as we went, but it’s pretty clear that Ars has some top-shelf AI/ML experts reading our stories (and probably groaning out loud every time we went down a bit of a blind alley). This is a great opportunity for you to jump into the conversation and help us understand how we can improve for next time—or, even better, to help us pick smarter projects if we do an experiment like this again!
Our chat kicks off on Wednesday, July 28, at 1:00 pm Eastern Time (that’s 10:00 am Pacific Time and 17:00 UTC). Our three-person panel will consist of Ars Infosec Editor Emeritus Sean Gallagher and me, along with Amazon Senior Principal Technical Evangelist (and AWS expert) Julien Simon. If you’d like to register so that you can ask questions, use this link here; if you just want to watch, the discussion will be streamed on the Ars Twitter account and archived as an embedded video on this story’s page. Register and join in or check back here after the event to watch!
We may have bitten off more than we could chew, folks.
An Amazon engineer told me that when he heard what I was trying to do with Ars headlines, the first thing he thought was that we had chosen a deceptively hard problem. He warned that I needed to be careful about properly setting my expectations. If this was a real business problem… well, the best thing he could do was suggest reframing the problem from “good or bad headline” to something less concrete.
That statement was the most family-friendly and concise way of framing the outcome of my four-week, part-time crash course in machine learning. As of this moment, my PyTorch kernels aren’t so much torches as they are dumpster fires. The accuracy has improved slightly, thanks to professional intervention, but I am nowhere near deploying a working solution. Today, as I am allegedly on vacation visiting my parents for the first time in over a year, I sat on a couch in their living room working on this project and accidentally launched a model training job locally on the Dell laptop I brought—with a 2.4 GHz Intel Core i3 7100U CPU—instead of in the SageMaker copy of the same Jupyter notebook. The Dell locked up so hard I had to pull the battery out to reboot it.
We’re in phase three of our machine-learning project now—that is, we’ve gotten past denial and anger, and we’re now sliding into bargaining and depression. I’ve been tasked with using Ars Technica’s trove of data from five years of headline tests, which pair two ideas against each other in an “A/B” test to let readers determine which one to use for an article. The goal is to try to build a machine-learning algorithm that can predict the success of any given headline. And as of my last check-in, it was… not going according to plan.
I had also spent a few dollars on Amazon Web Services compute time to discover this. Experimentation can be a little pricey. (Hint: If you’re on a budget, don’t use the “AutoPilot” mode.)
We’d tried a few approaches to parsing our collection of 11,000 headlines from 5,500 headline tests—half winners, half losers. First, we had taken the whole corpus in comma-separated value form and tried a “Hail Mary” (or, as I see it in retrospect, a “Leeroy Jenkins”) with the Autopilot tool in AWS’ SageMaker Studio. This came back with an accuracy result in validation of 53 percent. This turns out to be not that bad, in retrospect, because when I used a model specifically built for natural-language processing—AWS’ BlazingText—the result was 49 percent accuracy, or even worse than a coin-toss. (If much of this sounds like nonsense, by the way, I recommend revisiting Part 2, where I go over these tools in much more detail.)
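For readers who want a feel for the underlying task, here is a deliberately tiny, dependency-free sketch of a win/lose headline classifier using Naive Bayes. This is not the SageMaker Autopilot or BlazingText pipeline used in the project; the handful of headlines and their labels below are invented purely for illustration.

```python
from collections import Counter
import math

# Toy stand-in for the 11,000-headline corpus: (headline, label) pairs,
# where 1 = A/B-test winner, 0 = loser. All examples are invented.
headlines = [
    ("New chip breaks speed records in stunning demo", 1),
    ("Company announces quarterly results", 0),
    ("Scientists reveal surprising truth about sleep", 1),
    ("Firm updates product line", 0),
    ("Hackers expose massive security flaw overnight", 1),
    ("Report published on industry trends", 0),
]

def tokenize(text):
    return text.lower().split()

# Count word frequencies per class.
counts = {0: Counter(), 1: Counter()}
class_totals = {0: 0, 1: 0}
for text, label in headlines:
    counts[label].update(tokenize(text))
    class_totals[label] += 1

vocab = set(counts[0]) | set(counts[1])

def predict(text):
    """Naive Bayes with add-one smoothing: return the higher-scoring class."""
    scores = {}
    for label in (0, 1):
        total_words = sum(counts[label].values())
        score = math.log(class_totals[label] / len(headlines))
        for word in tokenize(text):
            score += math.log((counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)
```

With six training examples this is a parlor trick, of course—the point is only to show the shape of the binary classification problem the real models were struggling with at scale.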
There’s a moment in any foray into new technological territory when you realize you may have embarked on a Sisyphean task. Staring at the multitude of tools available to take on the project, you research your options, read the documentation, and start to work—only to find that just defining the problem may be more work than finding the solution.
Reader, this is where I found myself two weeks into this adventure in machine learning. I familiarized myself with the data, the tools, and the known approaches to problems with this kind of data, and I tried several approaches to solving what on the surface seemed to be a simple machine learning problem: Based on past performance, could we predict whether any given Ars headline will be a winner in an A/B test?
Things have not been going particularly well. In fact, as I finished this piece, my most recent attempt showed that our algorithm was about as accurate as a coin flip.
Technology plays a huge role in nearly every aspect of financial services today. As the world moved online, tools and infrastructure to help people manage their money and make payments have burgeoned the world over in the past decade.
With much of the finance world now leveraging technology to conduct business, predict trends and deliver services, financial services regulators are also developing new technologies to monitor markets, supervise financial institutions and conduct other administrative activities. The emergence of purpose-built technologies to facilitate regulator oversight has, over the past few years, garnered its own moniker of supervisory technology, or suptech.
Interest in suptech is proliferating across the globe thanks to a diverse set of prudential and conduct regulators. A sampling of regulators developing suptech include the FDIC, CFPB, FINRA and Federal Reserve in the U.S.; the U.K.’s FCA and Bank of England; the National Bank of Rwanda in Africa; as well as the ASIC, HKMA and MAS in Asia. Several “super regulators” are also engaged in suptech efforts such as the Bank of International Settlements, the Financial Stability Board and the World Bank.
The strides in suptech demonstrate that creative thinking coupled with experimentation and scalable, easily accessible technologies are jump-starting a new approach to regulation.
In this post, we’ll examine a few core suptech use cases, consider its future and explore the challenges facing regulators as the market matures. The uses are diverse, so we’ll focus on three key areas: regulatory reporting, machine-readable regulation, and market and conduct oversight.
A quick general note: Nearly every financial services regulator is engaged in some type of suptech activity and the use cases discussed in this article are intended as a sample, not a comprehensive list.
But what exactly is suptech?
As a preliminary matter, we should quickly survey a few definitions of suptech to frame our understanding. Both the World Bank and BIS have offered definitions that provide useful outlines for this discussion. The World Bank states that suptech “refers to the use of technology to facilitate and enhance supervisory processes from the perspective of supervisory authorities.” It’s a little circular, but helpful.
The BIS defines suptech as “the use of technology for regulatory, supervisory and oversight purposes.” This is a similarly loose definition that describes the broader scope better.
Regardless of differences on the margins, the “sup” in these suptech definitions acknowledges the primacy of the idea that regulators’ objectives are to oversee the conduct, structure, and health of the financial system. Suptech technologies facilitate related regulatory supervision and enforcement processes.
Regulatory reporting refers to a broad swath of activities such as financial firms providing trading data to regulatory authorities and regulators’ analysis of financial data or corporate information to determine the projected health or potential risks facing an institution or the market.
The MAS and FDIC are incorporating transactional and financial data reported by firms as a means to assess their financial viability. The MAS, in conjunction with BIS, has run tech sprints soliciting new ideas relating to regulatory reporting, while the FDIC has “a regulatory reporting solution that would allow ‘on-demand’ monitoring of banks as opposed to being constrained by ‘point-in-time’ reporting. This project is particularly targeted at smaller, community banks that provide only aggregated data on their financial health on a quarterly basis.”
The HKMA recently outlined its three-year plan for the development of suptech, which includes developing an approach to “network analysis.” The HKMA will analyze reporting data related to corporate shareholding and financial exposure to bring them “to life as network diagrams, so that the relationships between different entities become more apparent. Greater transparency of the connections and dependencies between banks and their customers will enable HKMA supervisors to detect early warning signals within the entire credit network.”
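As a rough sketch of what such network analysis involves, the snippet below builds a toy exposure graph and walks it to find every entity connected to a given borrower. The banks, borrowers and exposures are invented for illustration; a real supervisory system would ingest actual reporting data and far richer relationship types.

```python
from collections import defaultdict, deque

# Hypothetical lender-borrower exposures of the kind the HKMA says it
# renders as network diagrams. All names are invented.
exposures = [
    ("Bank A", "Acme Holdings"),
    ("Bank A", "Beta Trading"),
    ("Bank B", "Acme Holdings"),
    ("Bank B", "Beta Trading"),
    ("Bank C", "Gamma Shipping"),
]

# Build an undirected graph: edges connect lenders to borrowers.
graph = defaultdict(set)
for bank, borrower in exposures:
    graph[bank].add(borrower)
    graph[borrower].add(bank)

def connected_entities(start):
    """Breadth-first search: every entity reachable via shared exposures."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

# If Acme Holdings were distressed, these entities share a credit linkage:
cluster = connected_entities("Acme Holdings")
```

Here the cluster contains Bank A, Bank B and Beta Trading, while Bank C and Gamma Shipping sit in a separate component—exactly the kind of connection-and-dependency picture a supervisor would want surfaced early.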
These reporting initiatives touch on a theme regulators have continuously struggled with: regulating markets and firms through a reactive approach to historical data. Regulation and enforcement are often retrospective activities — examining past behavior and data to decide how to sanction an organization or develop a regulatory framework to govern a particular type of activity or financial product. This can result in an approach to regulation too rooted in past failures, which might lack the flexibility to anticipate or adapt to emerging risks or financial products.
The tectonic shifts to American culture and society due to the pandemic are far from over. One of the more glaring ones is that the U.S. labor market is going absolutely haywire.
Millions are unemployed, yet companies — from retail to customer service to airlines — can’t find enough workers. This perplexing paradox, the force behind Uber price surges and endless holds when your flight is canceled, isn’t just inconvenient — it’s a loud and clear message from the post-pandemic American workforce. Many are underpaid, undervalued and underwhelmed in their current jobs, and are willing to change careers or walk away from certain types of work for good.
It’s worth noting that low-wage workers aren’t the only ones putting their foot down; white-collar quits are also at an all-time high. Extended unemployment benefits implemented during the pandemic may be keeping some workers on the sidelines, but employee burnout and job dissatisfaction are also primary culprits.
We have a wage problem and an employee satisfaction problem, and Congress has a long summer ahead of it to attempt to find a solution. But what are companies supposed to do in the meantime?
At this particular moment, businesses need a stopgap solution either until September, when COVID-19 relief and unemployment benefits are earmarked to expire, or something longer term and more durable that not only keeps the engine running but propels the ship forward. Adopting AI can be the key to both.
Declaring that we’re on the precipice of an AI awakening is probably nowhere near the most shocking thing you’ve read this year. But just a few short years ago, it would have frightened a vast number of people, as advances in automation and AI began to transform from a distant idea into a very personal reality. People were (and some holdouts remain) genuinely worried about losing their job, their lifeline, with visions of robots and virtual agents taking over.
But does this “AI takes jobs” storyline hold up in the cultural and economic moment we’re in?
Is AI really taking jobs if no one actually likes those jobs?
If this “labor shortage” has a silver lining, it’s our real-world version of the Sorting Hat. Taking money out of the equation on the question of employment is opening our eyes to what work people find desirable and, more evidently, what’s not. Specifically, the manufacturing, retail and service industries are taking the hardest labor hits, underscoring that tasks associated with those jobs — repetitive duties, unrewarding customer service tasks and physical labor — are driving more and more potential workers away.
Adopting AI in manufacturing accelerated during the pandemic to deal with volatility in the supply chain, but now it must move from “pilot purgatory” to widespread implementation. The best use cases for AI in this industry are ones that help with supply chain optimization, including quality inspection, general supply chain management and risk/inventory management.
Most critically, AI can predict when equipment might fail or break, reducing costs and downtime to almost zero. Industry leaders believe that AI is not only beneficial for business continuity but that it can augment the work and efficiency of existing employees rather than displace them. AI can assist employees by providing real-time guidance and training, flagging safety hazards, and freeing them from repetitive, low-skilled work by taking on such tasks itself, such as detecting potential assembly line defects.
In the manufacturing industry, this current labor shortage is not a new phenomenon. The industry has been facing a perception problem in the U.S. for a long time, mainly because young workers think manufacturers are “low tech” and low paying. AI can make existing jobs more attractive and directly lead to a better bottom line while also creating new roles for companies that attract subject-matter talent and expertise.
In the retail and service industries, arduous customer service tasks and low pay are leading many employees to walk out the door. Those that are still sticking it out have their hands tied because of their benefits, even though they are unhappy with the work. Conversational AI, which is AI that can interact with people in a human-like manner by leveraging natural language processing and machine learning, can relieve employees of many of the more monotonous customer experience interactions so they can take on roles focused on elevating retail and service brands with more cerebral, thoughtful human input.
Many retail and service companies adopted scripted chatbots during the pandemic to help with the large online volumes only to realize that chatbots operate on a fixed decision tree — meaning if you ask something out of context, the whole customer service process breaks down. Advanced conversational AI technologies are modeled on the human brain. They even learn as they go, getting more skilled over time, presenting a solution that saves retail and service employees from the mundane while boosting customer satisfaction and revenue.
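The fixed-decision-tree failure mode described above is easy to see in miniature. The sketch below is a hypothetical scripted bot: input that matches a branch advances the script, and anything out of context falls straight through to a canned apology.

```python
# A minimal scripted chatbot of the kind the article describes: a fixed
# decision tree keyed on exact menu choices. Everything here is invented
# for illustration.
decision_tree = {
    "start": {
        "prompt": "Do you want to (1) track an order or (2) return an item?",
        "1": "track",
        "2": "return",
    },
    "track": {"prompt": "Please enter your order number."},
    "return": {"prompt": "Please enter the item you wish to return."},
}

def respond(state, user_input):
    """Follow the tree if the input matches a branch; otherwise break down."""
    node = decision_tree[state]
    next_state = node.get(user_input.strip())
    if next_state is None or next_state == node.get("prompt"):
        # The out-of-context case: a fixed tree has no branch for this.
        return "start", "Sorry, I don't understand. " + decision_tree["start"]["prompt"]
    return next_state, decision_tree[next_state]["prompt"]
```

Typing “1” works; typing “where is my refund?” resets the script—which is the breakdown the article describes, and the gap conversational AI (intent classification rather than exact matching) is meant to close.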
Hesitancy and misconceptions about AI in the workplace have long been a barrier to widespread adoption — but companies experiencing labor shortages should consider where it can make their employees’ lives better and easier, which can only be a benefit for bottom-line growth. And it might just be the big break that AI needs.
Fixing workplace misconduct reporting is a mission that’s snagged London-based Vault Platform backing from Google’s AI-focused fund, Gradient Ventures, which is the lead investor in an $8.2 million Series A that’s being announced today.
Other investors joining the round are Illuminate Financial, along with existing investors including Kindred Capital and Angular Ventures. Its $4.2M seed round was closed back in 2019.
Vault sells a suite of SaaS tools to enterprise-sized or large scale-up companies to help them proactively manage internal ethics and integrity issues. As well as tools for staff to report issues, data and analytics are baked into the platform — so it can support customers’ wider audit and compliance requirements.
In an interview with TechCrunch, co-founder and CEO Neta Meidav said Gradient Ventures is wholly on board with the overarching mission of upgrading legacy reporting tools, such as the hotlines provided to staff to surface conduct-related workplace risks (be that bullying and harassment; racism and sexism; or bribery, corruption and fraud). But, as you might expect, the fund was also interested in the potential for applying AI to further enhance Vault’s SaaS-based reporting tool.
A feature of its current platform, called ‘GoTogether’, is an escrow system that holds a user’s misconduct report, releasing it to the relevant internal bodies only once that user is not the first or only person to have reported the same individual. The idea is that this can encourage staff (or outsiders, where open reporting is enabled) to report concerns they may otherwise hesitate to raise, for various reasons.
Vault now wants to expand the feature’s capabilities so it can be used to proactively surface problematic conduct that may not just relate to a particular individual but may even affect a whole team or division — by using natural language processing to help spot patterns and potential linkages in the kind of activity being reported.
“Our algorithms today match on an alleged perpetrator’s identity. However many events that people might report on are not related to a specific person — they can be more descriptive,” explains Meidav. “For example if you are experiencing some irregularities in accounting in your department, for example, and you’re suspecting that there is some sort of corruption or fraudulent activity happening.”
“If you think about the greatest [workplace misconduct] disasters and crises that happened in recent years — the Dieselgate story at Volkswagen, what happened in Boeing — the common denominator in all these cases is that there’s been some sort of a serious ethical breach or failure which was observed by several people within the organization in remote parts of the organization. And the dots weren’t connected,” she goes on. “So the capacity we’re currently building and increasing — building upon what we already have with GoTogether — is the ability to connect on these repeated events and be able to connect and understand and read the human input. And connect the dots when repeated events are happening — alerting companies’ boards that there is a certain ‘hot pocket’ that they need to go and investigate.
“That would save companies from great risk, great cost, and essentially could prevent huge loss. Not only financial but reputational, sometimes it’s even loss to human lives… That’s where we’re getting to and what we’re aiming to achieve.”
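One simple way to picture the “connecting the dots” capability Meidav describes is to compare the text of reports directly rather than matching only on a named person. The sketch below uses plain token overlap (Jaccard similarity) as a crude stand-in for the NLP Vault is building; the reports and the threshold are invented for illustration.

```python
# Illustrative sketch: flag pairs of free-text misconduct reports that look
# like repeated events on the same theme. A real system would use NLP
# embeddings; token overlap keeps the idea self-contained.
def jaccard(a, b):
    """Fraction of shared words between two texts (0.0 to 1.0)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

reports = [
    "irregularities in accounting entries in the finance department",
    "suspected fraudulent invoices in accounting",
    "manager shouted at staff during the meeting",
]

THRESHOLD = 0.15  # arbitrary cutoff for this toy example

# Pairs of reports similar enough to surface as a potential "hot pocket".
linked = [
    (i, j)
    for i in range(len(reports))
    for j in range(i + 1, len(reports))
    if jaccard(reports[i], reports[j]) >= THRESHOLD
]
```

Here the two accounting-related reports link up while the unrelated one does not—the toy version of alerting a board that several people, in different corners of the organization, are describing the same problem.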
There is the question of how defensible Vault’s GoTogether feature is — how easily it could be copied — given you can’t patent an idea. So baking in AI smarts may be a way to layer added sophistication to try to maintain a competitive edge.
“There’s some very sophisticated, unique technology there in the backend so we are continuing to invest in this side of our technology. And Gradient’s investment and the specific [support] we’re receiving from Google now will only increase that element and that side of our business,” says Meidav when we ask about defensibility.
Commenting on the funding in a statement, Gradient Ventures founder and managing partner, Anna Patterson, added: “Vault tackles an important space with an innovative and timely solution. Vault’s application provides organizations with a data-driven approach to tackling challenges like occupational fraud, bribery or corruption incidents, safety failures and misconduct. Given their impressive team, technology, and customer traction, they are poised to improve the modern workplace.”
The London-based startup was only founded in 2018 — and while it’s most keen to talk about disrupting legacy hotline systems, which offer only a linear and passive conduit for misconduct reporting, there are a number of other startups playing in the same space. Examples include the likes of LA-based AllVoices, YC-backed Whispli, Hootsworth and Spot to name a few.
Competition seems likely to continue to increase as regulatory requirements around workplace reporting keep stepping up.
The incoming EU Whistleblower Protection Directive is one piece of regulation Vault expects will increase demand for smarter compliance solutions — aka “TrustTech”, as it seeks to badge it — as it will require companies of more than 250 employees to have a reporting solution in place by the end of December 2021, encouraging European businesses to cast around for tools to help shrink their misconduct-related risk.
She also suggests a platform solution can help bridge gaps between different internal teams that may need to be involved in addressing complaints, as well as helping to speed up internal investigations by offering the ability to chat anonymously with the original reporter.
Meidav also flags the rising attention US regulators are giving to workplace misconduct reporting — noting some recent massive awards by the SEC to external whistleblowers, such as the $28M paid out to a single whistleblower earlier this year (in relation to the Panasonic Avionics consultant corruption case).
She also argues that growing numbers of companies going public (such as via the SPAC trend, where there will have been reduced regulatory scrutiny ahead of the ‘blank check’ IPO) raises reporting requirements generally — meaning, again, more companies will need to have in place a system operated by a third party which allows anonymous and non-anonymous reporting. (And, well, we can only speculate whether companies going public by SPAC may be in greater need of misconduct reporting services vs companies that choose to take a more traditional and scrutinized route to market… )
“Just a few years back I had to convince investors that this category really is a category — and fast forward to 2021, congratulations! We have a market here. It’s a growing category and there is competition in this space,” says Meidav.
“What truly differentiates Vault is that we did not just focus on digitizing an old legacy process. We focused on leveraging technology to truly empower more misconduct to surface internally and for employees to speak up in ways that weren’t available for them before. GoTogether is truly unique as well as the things that we’re doing on the operational side for a company — such as collaboration.”
She gives an example of how a customer in the oil and gas sector configured the platform to make use of an anonymous chat feature in Vault’s app so they could provide employees with a secure direct-line to company leadership.
“They’re utilizing the anonymous chat that the app enables for people to have a direct line to leadership,” she says. “That’s incredible. That is such a progressive, forward-looking way to be utilizing this tool.”
Meidav says Vault has around 30 customers at this stage, split between the US and EU — its core regions of focus.
And while its platform is geared towards enterprises, its early customer base includes a fair number of scale-ups — with familiar names like Lemonade, Airbnb, Kavak, G2 and OVO Energy on the list.
Scale-ups may be natural customers for this sort of product given the huge pressures that can be brought to bear upon company culture as a startup switches to expanding headcount very rapidly, per Meidav.
“They are the early adopters and they are also very much sensitive to events such as these kind of [workplace] scandals as it can impact them greatly… as well as the fact that when a company goes through a hyper growth — and usually you see hyper growth happening in tech companies more than in any other type of sector — hyper growth is at time when you really, as management, as leadership, it’s really important to safeguard your culture,” she suggests.
“Because it changes very, very quickly and these changes can lead to all sorts of things — and it’s really important that leadership is on top of it. So when a company goes through hyper growth it’s an excellent time for them to incorporate a tool such as Vault. As well as the fact that every company that even thinks of an IPO in the coming months or years will do very well to put a tool like Vault in place.”
Expanding Vault’s own team is also on the cards after this Series A close, as it guns for the next phase of growth for its own business. Presumably, though, it’s not short of a misconduct reporting solution.
Microsoft today announced PyTorch Enterprise, a new Azure service that provides developers with additional support when using PyTorch on Azure. It’s basically Microsoft’s commercial support offering for PyTorch.
PyTorch is a Python-centric open-source machine learning framework with a focus on computer vision and natural language processing. It was originally developed by Facebook and is, at least to some degree, comparable to Google’s popular TensorFlow framework.
Frank X. Shaw, Microsoft’s corporate VP for communications, described the new PyTorch Enterprise service as providing developers with “a more reliable production experience for organizations using PyTorch in their data sciences work.”
With PyTorch Enterprise, members of Microsoft’s Premier and Unified support program will get benefits like prioritized requests, hands-on support and solutions for hotfixes, bugs and security patches, Shaw explained. Every year, Microsoft will also select one PyTorch release for long-term support.
Azure already made it relatively easy to use PyTorch, and Microsoft has long invested in the library by, for example, taking over the development of PyTorch for Windows last year. As Microsoft noted in today’s announcement, the latest release of PyTorch will be integrated with Azure Machine Learning, and the company promises to contribute the PyTorch code it develops back to the public PyTorch distribution.
Enterprise support will be available for PyTorch version 1.8.1 and up on Windows 10 and a number of popular Linux distributions.
“This new enterprise-level offering by Microsoft closes an important gap. PyTorch gives our researchers unprecedented flexibility in designing their models and running their experiments,” said Jeremy Jancsary, Senior Principal Research Scientist at Nuance. “Serving these models in production, however, can be a challenge. The direct involvement of Microsoft lets us deploy new versions of PyTorch to Azure with confidence.”
With this new offering, Microsoft is taking a page out of the open-source monetization playbook for startups by offering additional services on top of an open-source project. Since PyTorch wasn’t developed by a startup, only to have a major cloud provider then offer its own commercial version on top of the open-source code, this feels like a rather uncontroversial move.
The Fireflies.ai project is a good reminder that not every startup project goes from idea to unicorn-status in 48 minutes. Instead, the startup’s CEO Krish Ramineni told TechCrunch about how a period of interest in natural language processing (NLP), tinkering with a friend, a stint at Microsoft, and even working on Slack bots led him to help found Fireflies.ai (Fireflies), a company that today announced a $14 million raise led by Khosla.
Fireflies is a two-part service. Its first point of business is recording and transcribing voice conversations. Things like video meetings, for example. Next, Fireflies wants to plug your voice data into other applications, helping its customers automate data entry, task creation and more.
Before today’s round, the startup had raised around $5 million, including some micro-rounds, a stint in the Acceleprise accelerator, and a $4.9 million seed round raised in late 2019. That investment included participation from Canaan Partners and well-known angel April Underwood.
That Fireflies has raised more capital is not surprising, given how quickly it has accreted users. According to an interview with Ramineni, more than 10,000 teams use Fireflies today. In individual usage terms, some 35,000 organizations are represented amongst its user base.
As the company launched its product in early 2020, those results sound pretty good.
But TechCrunch was curious if revenue tracked with usage at Fireflies, as is sometimes the case. It does, Ramineni said, adding that his company grew its revenues 300% in the last six or seven months.
How did it manage such rapid growth while having raised only $5 million before, and with around 90% of its staff in its product and engineering teams? By pursuing everyone’s favorite: the bottoms-up sales model. In short, you can use Fireflies for free, but if you run out of meeting credits, hit other usage-based blockers or need different, paywalled functionality, you have to cough up for the product.
Folks are, it appears.
Fireflies is in fact an interesting hybrid of SaaS and usage-based pricing. The higher the paid tier that a user selects, the more minutes of transcription they are apportioned per month. But there are caps, limits that users can buy their way out of. TechCrunch asked Ramineni about it, with the CEO explaining that some customers want to ingest years of saved meetings. Our read is that despite work done by the startup to keep its infrastructure costs low, building pricing guardrails around product usage just makes sense for the startup.
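The guardrails described above amount to a per-tier minutes budget with a purchasable overage. Here is a minimal sketch of that logic, with tier names and numbers that are entirely invented (Fireflies’ actual plans and limits may differ):

```python
# Hypothetical usage-based pricing guardrails: each tier grants a monthly
# transcription allowance, and paid tiers can buy extra minutes past the cap.
TIERS = {
    "free": {"included_minutes": 800, "overage_allowed": False},
    "pro": {"included_minutes": 8000, "overage_allowed": True},
    "business": {"included_minutes": 20000, "overage_allowed": True},
}

def can_transcribe(tier, minutes_used, minutes_requested, purchased_extra=0):
    """Allow a job if it fits in the tier's allowance plus any purchased extra."""
    plan = TIERS[tier]
    budget = plan["included_minutes"]
    if plan["overage_allowed"]:
        budget += purchased_extra
    return minutes_used + minutes_requested <= budget
```

The design point the sketch illustrates: even with SaaS-like margins, a customer asking to ingest years of saved meetings would blow through any flat allowance, so the cap-plus-buyout structure protects infrastructure costs without turning away heavy users.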
The company will sport SaaS-like gross margins, Ramineni confirmed to TechCrunch.
Looking ahead, Fireflies wants to plug into more and more meeting platforms, and external software. You can currently link your Fireflies account to services like Zapier, Slack and your CRM. Over time, it’s not hard to see how the startup could take more direct commands from meetings, and help users better distribute, file and recall meeting information.
As someone with too many meetings, and too many notes documents spread out across the wasteland that is my Google Drive account, I get why people are using Fireflies today. But if the startup can build a no-code automation platform on top of my note taking? Then I will probably have to buy its service.
Speaking of which, as a final note, working for a Major American Corporation can have its downsides. For example, Ramineni provided TechCrunch with a recording of our interview inside of Fireflies. This was nice, as I prefer to write from both my notes and transcripts to ensure that I am not missing things, or making mistakes. Fireflies kept asking me to log in. I tried with my corporate Google account. Which blocks such log-ins. So I kept getting the same prompt again and again.
Annoying? Sure. Lethal? No.
More when we can squeeze more growth data out of the startup.
As far as AI systems have come in their ability to recognize what you’re saying and respond, they’re still very easily confused unless you speak carefully and literally. Google has been working on a new language model called LaMDA that’s much better at following conversations in a natural way, rather than as a series of badly formed search queries.
LaMDA is meant to be able to converse normally about just about anything without any kind of prior training. This was demonstrated in a pair of rather bizarre conversations with an AI first pretending to be Pluto and then a paper airplane.
While the utility of having a machine learning model that can pretend to be a planet (or dwarf planet, a term it clearly resents) is somewhat limited, the point of the demonstration was to show that LaMDA could carry on a conversation naturally even on this random topic, and in the arbitrary fashion of the first person.
The advance here is basically preventing the AI system from being led off track and losing the thread when attempting to respond to a series of loosely associated questions.
Normal conversations between humans jump between topics and call back to earlier ideas constantly, a practice that confuses language models to no end. But LaMDA can at least hold its own and not crash out with a “Sorry, I don’t understand” or a non-sequitur answer.
While most people are unlikely to want to have a full, natural conversation with their phones, there are plenty of situations where this sort of thing makes perfect sense. Groups like kids and older folks who don’t know or don’t care about the formalized language we use to speak to AI assistants will be able to interact more naturally with technology, for instance. And identity will be important if this sort of conversational intelligence is built into a car or appliance. No one wants to ask “Google” how much milk is left in the fridge, but they might ask “Whirly” or “Fridgadore,” the refrigerator speaking for itself.
Even CEO Sundar Pichai seemed unsure as to what exactly this new conversational AI would be used for, and emphasized that it’s still a work in development. But you can probably expect Google’s AIs to be a little more natural in their interactions going forward. And you can finally have that long, philosophical conversation with a random item you’ve always wanted.
Which disease results in the highest total economic burden per annum? If you guessed diabetes, cancer, heart disease or even obesity, you guessed wrong. Reaching a mammoth financial burden of $966 billion in 2019, the cost of rare diseases far outpaced diabetes ($327 billion), cancer ($174 billion), heart disease ($214 billion) and other chronic diseases.
Cognitive intelligence, or cognitive computing, blends artificial intelligence technologies such as neural networks, machine learning and natural language processing to mimic human intelligence.
It’s not surprising that rare diseases didn’t come to mind. By definition, a rare disease affects fewer than 200,000 people. Collectively, however, there are thousands of rare diseases, affecting around 400 million people worldwide. About half of rare disease patients are children, and the typical patient, young or old, weathers a diagnostic odyssey lasting five years or more, during which they undergo countless tests and see numerous specialists before ultimately receiving a diagnosis.
No longer a moonshot challenge
Shortening that diagnostic odyssey and reducing the associated costs was, until recently, a moonshot challenge, but is now within reach. About 80% of rare diseases are genetic, and technology and AI advances are combining to make genetic testing widely accessible.
Whole-genome sequencing, an advanced genetic test that allows us to examine the entire human DNA, now costs under $1,000, and market leader Illumina is targeting a $100 genome in the near future.
The remaining challenge is interpreting that data in the context of human health, which is no trivial task. A typical human genome contains around 5 million unique genetic variants, and from those we need to identify a single disease-causing variant. Recent advances in cognitive AI allow us to interrogate a person’s whole-genome sequence and identify disease-causing mechanisms automatically, augmenting human capacity.
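To make the filtering problem concrete, here is a deliberately simplified Python sketch (not any vendor’s actual pipeline) of how candidate variants might be prioritized by combining population frequency with a predicted pathogenicity score. The field names and thresholds are invented for illustration.

```python
# Toy variant-prioritization filter: keep variants that are rare in the
# general population AND predicted damaging, then rank the survivors.

def prioritize_variants(variants, max_pop_freq=0.001, min_pathogenicity=0.9):
    """Return variants rare in the population and predicted damaging,
    sorted most-suspicious first."""
    candidates = [
        v for v in variants
        if v["pop_freq"] <= max_pop_freq and v["pathogenicity"] >= min_pathogenicity
    ]
    return sorted(candidates, key=lambda v: -v["pathogenicity"])

variants = [
    {"id": "chr1:12345A>G", "pop_freq": 0.15,   "pathogenicity": 0.20},  # common
    {"id": "chr7:6789C>T",  "pop_freq": 0.0002, "pathogenicity": 0.97},  # suspect
    {"id": "chrX:4321G>A",  "pop_freq": 0.0001, "pathogenicity": 0.50},  # benign-ish
]
shortlist = prioritize_variants(variants)
```

Real pipelines layer many more annotations (inheritance patterns, phenotype matching, known disease gene lists), but the core idea is the same: collapse millions of variants into a reviewable shortlist.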
A shift from narrow to cognitive AI
The path to a broadly usable AI solution required a paradigm shift from narrow to broader machine learning models. Scientists interpreting genomic data review thousands of data points, collected from different sources, in different formats.
An analysis of a human genome can take as long as eight hours, and there are only a few thousand qualified scientists worldwide. When we reach the $100 genome, analysts are expecting 50 million-60 million people will have their DNA sequenced every year. How will we analyze the data generated in the context of their health? That’s where cognitive intelligence comes in.
More money for the edtech boom: Munich-based StudySmarter, which makes digital tools to help learners of all ages swot up — styling itself as a ‘lifelong learning platform’ — has closed a $15 million Series A.
The round is led by sector-focused VC fund, Owl Ventures. New York-based Left Lane Capital is co-investing, along with Lars Fjeldsoe-Nielsen (ex WhatsApp, Uber and Dropbox; now GP at Balderton Capital), and existing early stage investor Dieter von Holtzbrinck Ventures (aka DvH Ventures).
The platform, which launched back in 2018 and has amassed a user-base of 1.5M+ learners — with a 50/50 split between higher education students and K12 learners, and with main markets so far in German speaking DACH countries in Europe — uses AI technologies like natural language processing (NLP) to automate the creation of text-based interactive custom courses and track learners’ progress (including by creating a personalized study plan that adjusts as they go along).
StudySmarter claims its data shows that 94% of learners achieve better grades as a result of using its platform.
While NLP is generally most advanced for the English language, the startup says it’s confident its NLP models can be transferred to new languages without requiring new training data — claiming its tech is “scalable in any language”. (Although it concedes its algorithms increase in accuracy for a given language as users upload more content so the software itself is undertaking a learning journey and will necessarily be at a different point on the learning curve depending on the source content.)
Here’s how StudySmarter works: Users input their study goals to get recommendations for relevant revision content that’s been made available to the platform’s community.
They can also contribute content themselves to create custom courses by uploading assets like lecture slides and revision notes. StudySmarter’s platform can then turn this source material into interactive study aids — like flashcards and revision exercises — and the startup touts the convenience of the approach, saying it enables students to manage all their revision in one place (rather than wrangling multiple learning apps).
In short, it’s both a (revision) content marketplace and a productivity platform for learning — as it helps users create their own study (or lesson) plans, and offers them handy tools like a digital magic marker that automatically turns highlighted text into flashcards, while the resulting “smart” flashcards also apply the principle of spaced repetition learning to help make the studied content stick.
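Spaced repetition itself is simple to sketch. Below is a toy scheduler in the spirit of the classic SM-2 algorithm; StudySmarter’s actual implementation is not public, so the intervals and ease factors here are purely illustrative.

```python
# Minimal spaced-repetition scheduler sketch (SM-2-flavoured, values invented).

def next_interval(prev_interval_days, ease, correct):
    """Return (new_interval_days, new_ease) after one flashcard review."""
    if not correct:
        # Missed card: relearn tomorrow, and mark it as slightly harder
        return 1, max(1.3, ease - 0.2)
    if prev_interval_days == 0:
        return 1, ease  # first successful review
    # Each success stretches the gap and nudges the ease factor up
    return round(prev_interval_days * ease), min(3.0, ease + 0.1)

interval, ease = 0, 2.5
for correct in [True, True, False, True]:
    interval, ease = next_interval(interval, ease, correct)
    print(f"next review in {interval} day(s)")
```

The key property is that well-known cards drift out to long intervals while shaky ones keep resurfacing, which is what makes the content “stick”.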
Users can choose to share content they create with other learners in the StudySmarter community (or not). The startup says a quarter (25%) of its users are creators, and that 80% of the content they create is shared. Overall, it says its platform provides access to more than 25 million pieces of shared content currently.
It’s topic agnostic, as you’d expect, so course content covers a diverse range of subjects. We’re told the most popular courses to study are: Economics, Medicine, Law, Computer Science, Engineering and school subjects such as Maths, Physics, Biology and English.
Regardless of how learners use it, the platform uses AI to nudge users towards relevant revision content and topics (and study groups) to keep extending and supporting their learning process — making adaptive, ongoing recommendations for other stuff they should check out.
“The ease of creating learning materials on the StudySmarter platform results in a democratization of high-quality educational content, driven by learners themselves,” is the claim.
As well as user generated content (UGC), StudySmarter’s platform hosts content created by verified educationists and publishers — and there’s an option for users to search only for such verified content, i.e. if they don’t want to dip into the UGC pool.
“In general, there is no single workflow,” says co-founder and CMO Maurice Khudhir. “We created StudySmarter to adapt to different learner types. Some are very active learners and prefer to create content, some only want to search and consume content from other peers/publishers.”
“Our platform focuses on the art of learning itself, rather than being bound by topics, sectors, industries or content types. This means that anyone, regardless of what they’re learning, can use StudySmarter to improve how they learn. We started in higher education as it was the closest, most relevant market to where we were at the time of launch. We more recently expanded to K12, and are currently running our first corporate learning pilot.”
Gamification is a key strategy to encourage engagement and advance learning, with the platform dishing out encouraging words and emoji, plus rewards like badges and achievements based on the individual’s progress. Think of it as akin to Duolingo-style microlearning — but where users get to choose the subject (not just the language) and can feed in source material if they wish.
StudySmarter says it’s taken inspiration from tech darlings like Netflix and Tinder — baking in recommendation algorithms to surface relevant study content for users (à la Netflix’s ‘watch next’ suggestions), and deploying a Tinder-swipe-style learning UI on mobile so that its “smart flashcards” can adapt to users’ responses.
“Firstly, we individualise the learning experience by recommending appropriate content to the learner, depending on their demographics, demands and study goals,” explains Khudhir. “For instance, when an economics student uploads a PDF on the topic of marginal cost, StudySmarter will recommend several user-generated courses that cover marginal cost and/or several flashcards on marginal cost as well as e-books on StudySmarter that cover this topic.
“In this way, StudySmarter is similar to Netflix — Netflix will suggest similar TV shows and films depending on what you’ve already watched and StudySmarter will recommend different learning materials depending on the types of content and topics you interact with.
“As well, depending on how the student likes to learn, we also individualise the learning journey through things such as the smart flashcard learning algorithm. This is based on spaced repetition. For example, if a student is testing themselves on microeconomics, the flashcard set will go through different questions and responses and the student can swipe through the flashcards, in a similar way to Tinder. The flashcards’ sequence will adapt after every response.
“The notifications are also personalised — so they will remind the student to learn at particular points in the day, adapted to how the student uses the app.”
There’s also a scan functionality which uses OCR (optical character recognition) technology that lets users upload (paper-based) notes, handouts or books — and a sketch feature lets them carry out further edits, if they want to add more notes and scribbles.
Once ingested into the platform, this scanned (paper-based) content can of course also be used to create digital learning materials — extending the utility of the source material by plugging it into the platform’s creation and tracking capabilities.
“A significant cohort of users access StudySmarter on tablets, and they find this learning flow very useful, especially for our school-age pupils,” he adds.
StudySmarter can also offer educators and publishers detailed learning analytics, per Khudhir — who says its overarching goal is to establish itself as “the leading marketplace for educational content”, i.e. by using the information it gleans on users’ learning goals to directly recommend (relevant) professional content — “making it an extremely effective distribution platform”, as he puts it.
In addition to students, he says the platform is being used by teachers, professors, trainers, and corporate members — i.e. to create content to share with their own students, team members, course participants etc., or just to publish publicly. And he notes a bit of a usage spike from teachers in March last year as the pandemic shut down schools in Europe.
What about copyright? Khudhir says they follow a three-layered system to minimize infringement risks — firstly by not letting users share or export any professional content hosted on the platform.
Uploaded documents like lecture notes and users’ own comments can be shared within one university course/class in a private learning group. But only UGC (like flashcards, summaries and exercises) can be shared freely with the entire StudySmarter community, if the user wants to.
“It’s important to note that no content is shared without the author’s permission,” he notes. “We also have a contact email for people to raise potential copyright infringements. Thanks to this system, we can say that we never had a single copyright issue with universities, professors or publishers.”
Another potential pitfall around UGC is quality. And, clearly, no student wants to waste their time revising from poor (or just plain wrong) revision notes.
StudySmarter says it’s limiting that risk by tracking how learners engage with shared content on the platform — in order to create quality scores for UGC — monitoring factors like how often such material is used for learning; how often the students who study from it answer questions correctly; and by looking at the average learning time for a particular flashcard or summary, etc.
“We combine this with an active feedback system from the students to assign each piece of content a dynamic quality score. The higher the score is, the more often it is shown to new users. If the score falls below a certain threshold, the content is removed and is only visible to the original creator,” he goes on, adding: “We track the quality of shared content on the creator level so users who consistently share low-quality content can be banned from sharing more content on the platform.”
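A dynamic quality score of this kind can be sketched in a few lines. The weights and visibility threshold below are invented for illustration; StudySmarter has not published its actual formula.

```python
# Hypothetical UGC quality score: blend engagement, answer accuracy and
# explicit learner feedback into [0, 1], and hide low scorers.

VISIBILITY_THRESHOLD = 0.4

def quality_score(times_studied, correct_rate, avg_feedback):
    """Combine usage volume, answer accuracy and ratings (all weights invented)."""
    usage = min(times_studied / 100, 1.0)  # saturate at 100 study sessions
    return 0.2 * usage + 0.5 * correct_rate + 0.3 * avg_feedback

def visible_to_community(score):
    """Content below the threshold stays visible only to its creator."""
    return score >= VISIBILITY_THRESHOLD

good = quality_score(times_studied=250, correct_rate=0.8, avg_feedback=0.9)
bad = quality_score(times_studied=3, correct_rate=0.2, avg_feedback=0.1)
```

Recomputing the score as new reviews arrive is what makes it “dynamic”: popular, accurate content surfaces more often, while weak content quietly disappears from search.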
There are unlikely to be quality issues with verified educator/publisher content. But since it’s professional content, StudySmarter can’t expect to get it purely for free — so it says it “mostly” follows revenue-sharing agreements with these types of contributors.
It is also sharing data on learning trends to help publishers reach relevant learners, as mentioned above. So the information it can provide education publishers about potential customers is probably the bigger carrot for pulling them in.
“We are very happy to say that the vast majority of our content is not created or shared on StudySmarter for any financial incentive but rather because our platform and technology simply make the creation significantly easier,” says Khudhir, adding: “We have not paid a single Euro to any user on StudySmarter to create content and do not intend to do so going forward.”
It’s still early days for monetization, which he says isn’t front of mind yet — with the team focused on building out the platform’s global reach — but he notes that the model allows for a number of b2b revenue streams, adding that they’ve been doing some early b2b monetization by working with employers and businesses to promote their graduate programs or to support recruitment drives.
The new funding will be put towards product development and supporting the platform’s global expansion, per Khudhir.
“We’ve run successful pilots in the U.K. and U.S. so they’re our primary focus to expand to by Q3 this year. In fact, following a test pilot in the U.K. in December, we became the number one education app within 24 hours (ahead of the likes of Duolingo, Quizlet, Kahoot, and Photomath), which bodes well!” he goes on.
“Brazil, India and Indonesia are key targets for us due to a wider need for digital education. We’re also looking to launch in France, Nordics, Spain, Russia and many more countries. Due to the fact our platform is content-agnostic, and the technology that underpins it is universal, we’re able to scale effectively in multiple countries and languages. Within the next 12 months, we will be expanding to more than 12 countries and support millions of learners globally.”
StudySmarter’s subject-agnostic, feature-packed, one-stop-shop platform approach sets it apart from what Khudhir refers to as “single-feature apps” — apps that just help you learn one thing, be that Duolingo (only languages) or apps that focus on teaching a particular skill set, like Photomath for maths equations or dedicated learn-to-code apps, courses and toys.
But where the process of learning is concerned, there are lots of ways of going about it, and no one way that suits everyone (or every subject), so there’s undoubtedly room for (and value in) a variety of approaches, which may happily operate in parallel. So it seems a safe bet that broad-brush learning platforms aren’t going to replace specialized tools — or (indeed) vice versa.
StudySmarter names the likes of Course Hero, StuDocu, Quizlet and Anki as taking a similar broad approach — while simultaneously claiming they’re not doing it in “quite the same, holistic, end-to-end, all-in-one bespoke platform for learners” way.
Albeit, some of those edtech rivals are doing it with a lot more capital already raised. So StudySmarter is going to need to work smart and hard to localize and grab students’ attention as it guns for growth far beyond its European base.
Among the many, many tasks required of grade school teachers is that of gauging each student’s reading level, usually by a time-consuming and high-pressure one-on-one examination. Microsoft’s new Reading Progress application takes some of the load off the teacher’s shoulders, allowing kids to do their reading at home and using natural language understanding to help highlight obstacles and progress.
The last year threw most educational plans into disarray, and reading levels did not advance the way they would have if kids were in school. Companies like Amira are emerging to fill the gap with AI-monitored reading, and Microsoft aims to provide teachers with more tools on their side.
Reading Progress is an add-on for Microsoft Teams that helps teachers administer reading tests in a more flexible way, taking pressure off students who might stumble in a command performance, and identifying and tracking important reading events like skipped words and self-corrections.
Teachers pick reading assignments for each student (or the whole class) to read, and the kids do so on their own time, more like doing homework than taking a test. They record a video directly in the app, the audio of which is analyzed by algorithms watching for the usual stumbles.
As you can see in this video testimony by 4th grader Brielle, this may be preferable to many kids:
If a bright and confident kid like Brielle feels better doing it this way (and is now reading two years ahead of her grade, nice work Brielle!), what about the kids who are having trouble reading due to dyslexia, or are worried about their accent, or are simply shy? Being able to just talk to their own camera, by themselves in their own home, could make for a much better reading — and therefore a more accurate assessment.
It’s not meant to replace the teacher altogether, of course — it’s a tool that allows overloaded educators to prioritize and focus better and track things more objectively. It’s similar to how Amira is not meant to replace in-person reading groups — impossible during the pandemic — but provides a similarly helpful process of quickly correcting common mistakes and encouraging the reader.
Microsoft published about half a dozen things pertaining to Reading Progress today. Here’s its origin story, a basic summary, its product hub, a walkthrough video, and citations supporting its approach. There’s more, too, in this omnibus post about new education-related products out soon or now.
While visual ‘no code‘ tools are helping businesses get more out of computing without the need for armies of in-house techies to configure software on behalf of other staff, access to the most powerful tech tools — at the ‘deep tech’ AI coal face — still requires some expert help (and/or costly in-house expertise).
This is where bootstrapping French startup, NLPCloud.io, is plying a trade in MLOps/AIOps — or ‘compute platform as a service’ (being as it runs the queries on its own servers) — with a focus on natural language processing (NLP), as its name suggests.
Developments in artificial intelligence have, in recent years, led to impressive advances in the field of NLP — a technology that can help businesses scale their capacity to intelligently grapple with all sorts of communications by automating tasks like Named Entity Recognition, sentiment-analysis, text classification, summarization, question answering, and Part-Of-Speech tagging, freeing up (human) staff to focus on more complex/nuanced work. (Although it’s worth emphasizing that the bulk of NLP research has focused on the English language — meaning that’s where this tech is most mature; so associated AI advances are not universally distributed.)
Production ready (pre-trained) NLP models for English are readily available ‘out of the box’. There are also dedicated open source frameworks offering help with training models. But businesses wanting to tap into NLP still need to have the DevOps resource and chops to implement NLP models.
NLPCloud.io is catering to businesses that don’t feel up to the implementation challenge themselves — offering “production-ready NLP API” with the promise of “no DevOps required”.
Its API is based on Hugging Face and spaCy open-source models. Customers can either choose to use ready-to-use pre-trained models (it selects the “best” open source models; it does not build its own); or they can upload custom models developed internally by their own data scientists — which it says is a point of differentiation vs SaaS services such as Google Natural Language (which uses Google’s ML models) or Amazon Comprehend and Monkey Learn.
NLPCloud.io says it wants to democratize NLP by helping developers and data scientists deliver these projects “in no time and at a fair price”. (It has a tiered pricing model based on requests per minute, which starts at $39 per month and ranges up to $1,199 per month, at the enterprise end, for one custom model running on a GPU. It also offers a free tier so users can test models at low request velocity without incurring a charge.)
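For a flavour of what “no DevOps required” means in practice, here is a hypothetical sketch of constructing a named-entity extraction request against a hosted NLP API of this kind. The base URL, model name, endpoint path and payload shape are all assumptions for illustration; consult the provider’s documentation for the real interface.

```python
# Illustrative request construction for a hosted NLP API (details assumed,
# not taken from NLPCloud.io's actual documentation).
import json

API_BASE = "https://api.nlpcloud.io/v1"  # hypothetical base URL

def build_entities_request(token, model, text):
    """Return (url, headers, body) for a named-entity extraction call."""
    url = f"{API_BASE}/{model}/entities"
    headers = {
        "Authorization": f"Token {token}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    body = json.dumps({"text": text})
    return url, headers, body

url, headers, body = build_entities_request(
    "MY_API_TOKEN", "en_core_web_lg", "TechCrunch is based in San Francisco."
)
# An HTTP client (e.g. requests.post(url, headers=headers, data=body))
# would then run the query on the provider's servers.
```

The point of the service is that everything after this step — model hosting, GPUs, scaling, monitoring — is the provider’s problem, not the customer’s.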
“The idea came from the fact that, as a software engineer, I saw many AI projects fail because of the deployment to production phase,” says sole founder and CTO Julien Salinas. “Companies often focus on building accurate and fast AI models but today more and more excellent open-source models are available and are doing an excellent job… so the toughest challenge now is being able to efficiently use these models in production. It takes AI skills, DevOps skills, programming skill… which is why it’s a challenge for so many companies, and which is why I decided to launch NLPCloud.io.”
The platform launched in January 2021 and now has around 500 users, including 30 who are paying for the service. The startup, which is based in Grenoble, in the French Alps, is a team of three for now, plus a couple of independent contractors. (Salinas says he plans to hire five people by the end of the year.)
“Most of our users are tech startups but we also start having a couple of bigger companies,” he tells TechCrunch. “The biggest demand I’m seeing is both from software engineers and data scientists. Sometimes it’s from teams who have data science skills but don’t have DevOps skills (or don’t want to spend time on this). Sometimes it’s from tech teams who want to leverage NLP out-of-the-box without hiring a whole data science team.”
“We have very diverse customers, from solo startup founders to bigger companies like BBVA, Mintel, Senuto… in all sorts of sectors (banking, public relations, market research),” he adds.
Use cases of its customers include lead generation from unstructured text (such as web pages), via named entities extraction; and sorting support tickets based on urgency by conducting sentiment analysis.
Content marketers are also using its platform for headline generation (via summarization). While text classification capabilities are being used for economic intelligence and financial data extraction, per Salinas.
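The support-ticket use case can be illustrated with a toy sentiment heuristic. A production system would use a trained model (e.g. a transformer served via an API like the one described above); the tiny negativity lexicon below just shows the sorting idea.

```python
# Toy sentiment-based ticket triage: score each ticket's negativity with a
# small hand-written lexicon (illustrative only) and sort urgent-first.

NEGATIVE = {"broken", "urgent", "angry", "refund", "crash", "unacceptable"}

def negativity(text):
    """Fraction of words in the ticket that appear in the negative lexicon."""
    words = text.lower().split()
    return sum(w.strip(".,!?") in NEGATIVE for w in words) / max(len(words), 1)

def triage(tickets):
    """Most negative (likely most urgent) tickets first."""
    return sorted(tickets, key=negativity, reverse=True)

tickets = [
    "How do I change my avatar?",
    "App keeps crashing, this is unacceptable, I want a refund!",
    "Love the new update, thanks!",
]
queue = triage(tickets)
```

Swapping `negativity` for a model-backed sentiment score leaves the triage logic unchanged, which is exactly why the API-as-a-service packaging is attractive.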
He says his own experience as a CTO and software engineer working on NLP projects at a number of tech companies led him to spot an opportunity in the challenge of AI implementation.
“I realized that it was quite easy to build acceptable NLP models thanks to great open-source frameworks like spaCy and Hugging Face Transformers but then I found it quite hard to use these models in production,” he explains. “It takes programming skills in order to develop an API, strong DevOps skills in order to build a robust and fast infrastructure to serve NLP models (AI models in general consume a lot of resources), and also data science skills of course.
“I tried to look for ready-to-use cloud solutions in order to save weeks of work but I couldn’t find anything satisfactory. My intuition was that such a platform would help tech teams save a lot of time, sometimes months of work for the teams who don’t have strong DevOps profiles.”
“NLP has been around for decades but until recently it took whole teams of data scientists to build acceptable NLP models. For a couple of years, we’ve made amazing progress in terms of accuracy and speed of the NLP models. More and more experts who have been working in the NLP field for decades agree that NLP is becoming a ‘commodity’,” he goes on. “Frameworks like spaCy make it extremely simple for developers to leverage NLP models without having advanced data science knowledge. And Hugging Face’s open-source repository for NLP models is also a great step in this direction.
“But having these models run in production is still hard, and maybe even harder than before as these brand new models are very demanding in terms of resources.”
The models NLPCloud.io offers are picked for performance — where “best” means having “the best compromise between accuracy and speed”. Salinas also says they are paying mind to context, given NLP can be used for diverse use cases — hence proposing a number of models so as to be able to adapt to a given use.
“Initially we started with models dedicated to entities extraction only but most of our first customers also asked for other use cases too, so we started adding other models,” he notes, adding that they will continue to add more models from the two chosen frameworks — “in order to cover more use cases, and more languages”.
SpaCy and Hugging Face, meanwhile, were chosen to be the source for the models offered via its API based on their track record as companies, the NLP libraries they offer and their focus on production-ready frameworks — with the combination allowing NLPCloud.io to offer a selection of models that are fast and accurate, working within the bounds of respective trade-offs, according to Salinas.
“SpaCy is developed by a solid company in Germany called Explosion.ai. This library has become one of the most used NLP libraries among companies who want to leverage NLP in production ‘for real’ (as opposed to academic research only). The reason is that it is very fast, has great accuracy in most scenarios, and is an ‘opinionated’ framework, which makes it very simple to use by non-data scientists (the tradeoff is that it gives less customization possibilities),” he says.
“Hugging Face is an even more solid company that recently raised $40M for a good reason: They created a disruptive NLP library called ‘transformers’ that improves a lot the accuracy of NLP models (the tradeoff is that it is very resource intensive though). It gives the opportunity to cover more use cases like sentiment analysis, classification, summarization… In addition to that, they created an open-source repository where it is easy to select the best model you need for your use case.”
While AI is advancing at a clip within certain tracks — such as NLP for English — there are still caveats and potential pitfalls attached to automating language processing and analysis, with the risk of getting stuff wrong or worse. AI models trained on human-generated data have, for example, been shown to reflect the embedded biases and prejudices of the people who produced the underlying data.
Salinas agrees NLP can sometimes face “concerning bias issues”, such as racism and misogyny. But he expresses confidence in the models they’ve selected.
“Most of the time it seems [bias in NLP] is due to the underlying data used to train the models. It shows we should be more careful about the origin of this data,” he says. “In my opinion the best solution in order to mitigate this is that the community of NLP users should actively report something inappropriate when using a specific model so that this model can be paused and fixed.”
“Even if we doubt that such a bias exists in the models we’re proposing, we do encourage our users to report such problems to us so we can take measures,” he adds.
By now you’ve probably heard of ESG (Environmental, Social, Governance) ratings for companies, or ratings for their carbon footprint. Well, now a UK company has come up with a way of rating the ‘ethics’ of social media companies.
EthicsGrade is an ESG ratings agency, focusing on AI governance. Headed up by Charles Radclyffe, the former head of AI at Fidelity, it uses AI-driven models to create a more complete picture of the ESG of organizations, harnessing natural language processing to automate the analysis of huge data sets. This includes tracking controversial topics and public statements.
Frustrated with the green-washing of some ‘environmental’ stocks, Radclyffe realized that the AI governance of social media companies was not being properly considered, despite presenting an enormous risk to investors in the wake of such scandals as the manipulation of Facebook by companies such as Cambridge Analytica during the US Election and the UK’s Brexit referendum.
The idea is that these ratings are used by companies to better see where they should improve. But the twist is that asset managers can also see where the risks of AI might lie.
Speaking to TechCrunch he said: “While at Fidelity I got a reputation within the firm for being the go-to person, for my colleagues in the investment team, who wanted to understand the risks within the technology firms that we were investing in. After being asked a number of times about some dodgy facial recognition company or a social media platform, I realized there was actually a massive absence of data around this stuff as opposed to anecdotal evidence.”
He says that when he left Fidelity he decided EthicsGrade would set out to cover not just ESG but also AI ethics for platforms that are driven by algorithms.
He told me: “We’ve built a model to analyze technology governance. We’ve covered 20 industries. So most of what we’ve published so far has been non-tech companies because these are risks that are inherent in many other industries, other than simply social media or big tech. But over the next couple of weeks, we’re going live with our data on things which are directly related to tech, starting with social media.”
Essentially, what they are doing closely parallels what is being done in the ESG space.
“The question we want to be able to answer is how does Tik Tok compare against Twitter or Wechat as against WhatsApp. And what we’ve essentially found is that things like GDPR have done a lot of good in terms of raising the bar on questions like data privacy and data governance. But in a lot of the other areas that we cover, such as ethical risk or a firm’s approach to public policy, are indeed technical questions about risk management,” says Radclyffe.
But, of course, they are effectively rating algorithms. Are the ratings they are giving the social platforms themselves derived from algorithms? EthicsGrade says it is training its own AI through NLP as it goes, so that it can automate what is currently a very human-analyst-centric process, just as Sustainalytics et al did years ago in the environmental arena.
So how are they coming up with these ratings? EthicsGrade says it is evaluating “the extent to which organizations implement transparent and democratic values, ensure informed consent and risk management protocols, and establish a positive environment for error and improvement.” And this is all achieved, they say, through publicly available data: policy, website, lobbying etc. In simple terms, they rate the governance of the AI, not necessarily the algorithms themselves but what checks and balances are in place to ensure that the outcomes and inputs are ethical and managed.
“Our goal really is to target asset owners and asset managers,” says Radclyffe. “So if you look at any of these firms like, let’s say Twitter, 29% of Twitter is owned by five organizations: it’s Vanguard, Morgan Stanley, Blackrock, State Street and ClearBridge. If you look at the ownership structure of Facebook or Microsoft, it’s the same firms: Fidelity, Vanguard and BlackRock. And so really we only need to win a couple of hearts and minds, we just need to convince the asset owners and the asset managers that questions like the ones journalists have been asking for years are pertinent and relevant to their portfolios and that’s really how we’re planning to make our impact.”
Asked if they look at the content of things like tweets, he said no: “We don’t look at content. What we concern ourselves with is how they govern their technology, and where we can find evidence of that. So what we do is we write to each firm with our rating, with our assessment of them. We make it very clear that it’s based on publicly available data. And then we invite them to complete a survey. Essentially, that survey helps us validate the data on these firms. Microsoft is the only one that’s completed the survey.”
Ideally, firms will “verify the information, that they’ve got a particular process in place to make sure that things are well-managed and their algorithms don’t become discriminatory.”
In an age increasingly driven by algorithms, it will be interesting to see if this idea of rating them for risk takes off, especially amongst asset managers.
Match Group, the parent company of Tinder, Match, OkCupid, Hinge and other top dating apps, announced this morning it has made a seven-figure investment in nonprofit background check platform Garbo, with the goal of helping Match Group’s users make more informed decisions about their safety when dating online. The deal will see Match working closely with Garbo to integrate the background check technology into Tinder later this year, followed by other Match Group U.S. dating apps.
New York-based Garbo was founded in 2018 by Kathryn Kosmides, a survivor of gender-based violence who wanted to make it easier for anyone to look up critical information about someone’s background that could indicate a history of violence.
Typically, background check services are run by for-profit companies and surface a wide variety of personal information — like drug offenses or minor traffic violations — that isn’t always relevant to matters of safety and abuse. Plus, those types of charges are often levied against members of more vulnerable communities, Garbo has pointed out, and aren’t correlated with gender-based violence.
Garbo instead offers low-cost background checks by collecting public records and reports of violence and abuse only, including arrests, convictions, restraining orders, harassment, and other violent crimes. To use the service, a user enters either a first and last name, or a first name and phone number — often the only information a dating app user will have on one of their matches.
The service then performs what it calls an “equitable background check,” meaning it excludes drug possession charges from its results, as well as traffic tickets other than DUIs and vehicular manslaughter.
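As a rough illustration of that filtering rule, here is a hypothetical sketch in Python. The record fields and category names are invented for illustration; Garbo’s actual data model is not public.

```python
# Hypothetical sketch of the "equitable background check" rule described
# above: drug possession charges and non-DUI traffic offenses are excluded
# from results. Field and category names are invented for illustration.
RELEVANT_TRAFFIC = {"dui", "vehicular_manslaughter"}

def equitable_filter(records):
    """Keep only records relevant to safety, per the policy described above."""
    results = []
    for rec in records:
        category = rec["category"]
        if category == "drug_possession":
            continue  # excluded entirely
        if category == "traffic" and rec["subtype"] not in RELEVANT_TRAFFIC:
            continue  # minor traffic violations excluded
        results.append(rec)
    return results

records = [
    {"category": "drug_possession", "subtype": None},
    {"category": "traffic", "subtype": "speeding"},
    {"category": "traffic", "subtype": "dui"},
    {"category": "restraining_order", "subtype": None},
]
print(equitable_filter(records))  # keeps the DUI and the restraining order
```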
Last year, the nonprofit launched a beta test of its technology with 500 people in the NYC area, and soon grew its waitlist to over 6,000 more entirely by word-of-mouth. Garbo later pulled the test as the team realized the technology had the potential for national scale — something the team wanted to deliver before launching to the public.
As a small nonprofit without much financial backing, Garbo also realized a larger partner might be needed in that effort. After Kosmides was connected with Match Group’s new head of safety, Tracey Breeden, the two companies agreed to work together on bringing the technology to a broader U.S. audience.
“For far too long women and marginalized groups in all corners of the world have faced many barriers to resources and safety,” said Breeden, Match Group’s Head of Safety and Social Advocacy, in a statement about today’s news. “We recognize corporations can play a key role in helping remove those barriers with technology and true collaboration rooted in action. In partnership with Match Group, Garbo’s thoughtful and groundbreaking consumer background check will enable and empower users with information, helping create equitable pathways to safer connections and online communities across tech,” she added.
This is Match Group’s second investment in an outside safety technology provider to enhance its dating apps’ feature sets. In early 2020, the company invested in Noonlight to help it power new safety features inside Tinder and other dating apps following a damning investigative report by ProPublica and Columbia Journalism Investigations, published December 2019. The report revealed how Match Group allowed known sexual predators to use its apps. It also noted that Match Group didn’t have a uniform policy of running background checks on its dating app users, putting the responsibility on users to keep themselves safe.
Meanwhile, Tinder’s top competitor Bumble has been marketing itself as a more women-friendly alternative to traditional dating apps like Tinder, and has rolled out a number of features designed to keep users safe from bad actors — including, most recently, a way to prevent them from using the app’s “unmatch” option to hide from their victims.
Given that gaining a reputation for being an “unsafe” app could be significantly damaging to a brand like Tinder and the larger online dating industry as a whole, it’s obvious why Match Group is now directly investing to address this problem. With the Noonlight investment, for example, Match Group promised features like a discreet way to trigger emergency services inside the Tinder app, similar to the feature found in Uber and Lyft, plus other anti-abuse measures.
Match Group says Garbo will use the new investment to hire across product, engineering and in leadership — including a head of engineering and an initial team of five engineers. This team will work to build out Garbo’s capabilities, using technologies like natural language processing and A.I.
Garbo will also benefit from sizable contributions of time and resources from Match Group, as it gets its product fully operational and then rolled out across Match Group products, starting with Tinder. And Match Group will help Garbo make the nonprofit’s technology accessible to other platforms as well — like ridesharing companies.
Once live in Tinder, the background check feature may not be free, however.
Instead, Match Group says it will work to determine pricing based on factors like user adoption, how many searches people want to perform and more. It also hasn’t yet determined how deep the integration will be — whether, for example, it will link outside the app to Garbo or make it feel more like an in-app feature.
Match Group doesn’t have any exact time frame for the feature’s launch beyond “later this year” for Tinder, to be followed by other U.S. dating apps. The company may consider looking into similar investments for its services aimed at international users in the months to come.
Hugging Face has raised a $40 million Series B funding round — Addition is leading the round. The company has been building an open source library for natural language processing (NLP) technologies. You can find the Transformers library on GitHub — it has 42,000 stars and 10,000 forks.
Existing investors Lux Capital, A.Capital and Betaworks also participated in today’s funding round. Other investors include Dev Ittycheria, Olivier Pomel, Alex Wang, Aghi Marietti, Florian Douetteau, Richard Socher, Paul St. John, Kevin Durant and Rich Kleiman.
With Transformers, you can leverage popular NLP models, such as BERT, GPT, XLNet, T5 or DistilBERT, to manipulate text in one way or another. For instance, you can classify text, extract information, automatically answer questions, summarize text, generate text and more.
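For a sense of what that looks like in practice, here is a minimal sketch using the library’s `pipeline` API. It assumes the `transformers` package (and a backend such as PyTorch) is installed; the default model checkpoints are downloaded on first use and may change between releases.

```python
# Minimal sketch of the Transformers `pipeline` API. Assumes the
# `transformers` package (and a backend such as PyTorch) is installed.
from transformers import pipeline

def classify(texts):
    """Sentiment classification with the library's default checkpoint
    (downloaded on first use, so this needs network access)."""
    classifier = pipeline("sentiment-analysis")
    return classifier(texts)

def summarize(text, max_length=60):
    """Abstractive summarization with the default summarization model."""
    summarizer = pipeline("summarization")
    return summarizer(text, max_length=max_length)

# Example call (commented out because it triggers a model download):
# classify(["The Transformers library makes NLP models easy to use."])
# Results come back as dicts of the form {"label": ..., "score": ...}.
```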
There are many different use cases for NLP. A popular one has been support chatbots. For instance, challenger bank Monzo has been using Hugging Face behind the scenes to answer questions from its customers. Overall, around 5,000 companies are using Hugging Face in one way or another, including Microsoft with its search engine Bing.
As for its business model, the startup recently launched a paid offering that gets you prioritized support, private model management and hosting of the inference API. Clients include Bloomberg, Typeform and Grammarly.
With the new funding round, the company plans to triple its headcount in New York and Paris — there will be remote positions too. Interestingly, the company is also sharing some details about its finances.
Hugging Face was cash-flow positive in January and February 2021. The company raised a $15 million round a little over a year ago, and 90% of that round is still in the bank. The company’s valuation also saw a fivefold increase. This shouldn’t come as a surprise, as you can negotiate better terms when you don’t actually need to raise.
And it looks like Hugging Face is on the right path, as the company hosts a vibrant community of NLP developers. You can browse models and datasets, take advantage of them and contribute back as Hugging Face becomes a central hub for NLP enthusiasts.
The introduction of GPT-3 in 2020 was a tipping point for artificial intelligence. In 2021, this technology will power the launch of a thousand new startups and applications. GPT-3 and similar models have brought the power of AI into the hands of those looking to experiment — and the results have been extraordinary.
Trained on hundreds of billions of words, GPT-3 is a 175-billion-parameter transformer model — the third such model released by OpenAI. GPT-3 is remarkable in its ability to generate human-like text and responses — in some respects, it’s eerie. When prompted by a user with text, GPT-3 can return coherent and topical emails, tweets, trivia and much more.
Suddenly, authoring emails, customer interactions, social media exchanges and even news stories can be automated — at least in part. While large companies are pondering the pitfalls and risks of generating text (remember Microsoft’s disastrous Tay bot?), startups have already begun sweeping in with novel applications — and they will continue to lead the charge in transformer-based innovation.
OpenAI researchers first released the paper introducing GPT-3 in May 2020, and what started out as some nifty use cases on Twitter has quickly become a hotbed of startup activity. Companies have been formed on top of GPT-3, using the model to generate emails and marketing copy, to create an interactive nutrition tracker or chatbot, and more. Let OthersideAI take a first pass at writing your emails, or try out Broca or Snazzy for your ad copy and campaign content, for instance.
Other young companies are harnessing the API to accelerate their existing efforts, augmenting their technical teams’ capabilities with the power of 175 billion parameters and quickly bringing otherwise difficult products to market with much greater speed and data than previously possible. With some clever prompt engineering (a combination of an instruction to the model with a sample output to help guide the model), these companies leverage the underlying GPT-3 system to improve or extend an existing application’s capabilities.
Sure, a text expander can be a useful tool for shorthand notation — but powered by GPT-3, that shorthand can be transformed into a product that generates contextually aware emails in your own style of writing.
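The “prompt engineering” described above can be sketched as plain string composition: an instruction plus a worked example, prepended to the user’s input before it is sent to the model. Everything here (the instruction text, the worked example, the helper name) is illustrative rather than any particular product’s implementation.

```python
# Illustrative sketch of prompt engineering: compose an instruction and one
# worked example ahead of the user's input, then send the result to a large
# language model's completion endpoint. All strings here are invented.
def build_prompt(instruction, example_input, example_output, user_input):
    """Compose a few-shot prompt: instruction, one worked example, then the task."""
    return (
        f"{instruction}\n\n"
        f"Input: {example_input}\n"
        f"Output: {example_output}\n\n"
        f"Input: {user_input}\n"
        f"Output:"
    )

prompt = build_prompt(
    instruction="Rewrite the shorthand note as a polite, complete email.",
    example_input="mtg thu 3pm? agenda: q2 roadmap",
    example_output="Hi! Could we meet Thursday at 3pm to discuss the Q2 roadmap?",
    user_input="need slides by fri, thx",
)
# `prompt` would then be sent to a completions endpoint such as OpenAI's API,
# which continues the text after the final "Output:" marker.
print(prompt)
```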
As early-stage technology investors, we are inspired to see AI broadly, and natural language processing specifically, become more accessible via the next generation of large-scale transformer models like GPT-3. We expect they will unlock new use cases and capabilities we have yet to even contemplate.
Efficient and cost-effective vaccine distribution remains one of the biggest challenges of 2021, so it’s no surprise that startup Notable Health wants to use its automation platform to help. Founded in 2017 to help address the nearly $250 billion in annual administrative costs in healthcare, Notable Health uses automation to replace time-consuming, repetitive tasks in health industry admin. In early January of this year, the company announced plans to use that technology to help manage vaccine distribution.
“As a physician, I saw firsthand that with any patient encounter, there are 90 steps or touchpoints that need to occur,” said Notable Health medical director Muthu Alagappan in an interview. “It’s our hypothesis that the vast majority of those points can be automated.”
Notable Health’s core technology is a platform that uses robotic process automation (RPA), natural language processing (NLP), and machine learning to find eligible patients for the COVID-19 vaccine. Combined with data provided by hospital systems’ electronic health records, the platform helps those qualified to receive the vaccine set up appointments and guides them to other relevant educational resources.
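As a toy illustration of that triage step, the sketch below scans simplified patient records for eligibility and drafts an outreach message with a scheduling link. The field names, eligibility rules and link are invented for illustration; they are not Notable Health’s actual criteria or workflow.

```python
# Illustrative sketch of an eligibility-and-outreach step like the one
# described above. Field names, eligibility rules and the link are invented.
ELIGIBLE_AGE = 65
HIGH_RISK_CONDITIONS = {"copd", "diabetes", "heart_disease"}

def find_eligible(patients):
    """Return patients who qualify by age or by a high-risk condition."""
    return [
        p for p in patients
        if p["age"] >= ELIGIBLE_AGE
        or HIGH_RISK_CONDITIONS & set(p.get("conditions", []))
    ]

def outreach_message(patient):
    # In the real workflow, this would go out as a text message.
    return (f"Hi {patient['name']}, you are eligible for a COVID-19 vaccine. "
            f"Tap here to schedule: https://example.com/schedule")

patients = [
    {"name": "Ana", "age": 70, "conditions": []},
    {"name": "Ben", "age": 40, "conditions": ["diabetes"]},
    {"name": "Cam", "age": 30, "conditions": []},
]
for p in find_eligible(patients):
    print(outreach_message(p))
```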
“By leveraging intelligent automation to identify, outreach, educate and triage patients, health systems can develop efficient and equitable vaccine distribution workflows,” said Notable Health strategic advisor and Biden Transition COVID-19 Advisory Board Member Dr. Ezekiel Emanuel, in a press release.
Making vaccine appointments has been especially difficult for older Americans, many of whom have reportedly struggled with navigating scheduling websites. Alagappan sees that as a design problem. “Technology often gets a bad reputation, because it’s hampered by the many bad technology experiences that are out there,” he said.
Instead, he thinks Notable Health has kept the user in mind through a more simplified approach, asking users only for basic and easy-to-remember information through a text message link. “It’s that emphasis on user-centric design that I think has allowed us to still have really good engagement rates even with older populations,” he said.
While the startup’s platform will likely help hospitals and health systems develop a more efficient approach to vaccinations, its use of RPA and NLP holds promise for future optimization in healthcare. Leaders of similar technology in other industries have already gone on to have multi-billion dollar valuations, and continue to attract investors’ interest.
Artificial intelligence is expected to grow in healthcare over the next several years, but Alagappan argues that combining that with other, more readily available intelligent technologies is also an important step towards improved care. “When we say intelligent automation, we’re really referring to the marriage of two concepts: artificial intelligence—which is knowing what to do—and robotic process automation—which is knowing how to do it,” he said. That dual approach is what he says allows Notable Health to bypass administrative bottlenecks in healthcare, instructing bots to carry out those tasks in an efficient and adaptable way.
So far, Notable Health has worked with several hospital systems across multiple states on vaccine distribution and scheduling, and is now using its platform to reach out to tens of thousands of patients per day.