Monad emerges from stealth with $17M to solve the cybersecurity big data problem

Cloud security startup Monad, which offers a platform for extracting and connecting data from various security tools, has launched from stealth with $17 million in Series A funding led by Index Ventures. 

Monad was founded on the belief that enterprise cybersecurity is a growing data management challenge, as organizations try to understand and interpret the masses of information that are siloed within disconnected logs and databases. Once an organization has extracted data from its security tools, Monad’s Security Data Platform enables it to centralize that data within a data warehouse of choice, then normalize and enrich the data so that security teams have the insights they need to secure their systems and data effectively.
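As a rough illustration of what that extract-normalize-enrich flow can look like, here is a minimal Python sketch; the tool names, field names and enrichment lookup below are illustrative assumptions, not Monad’s actual platform or schema.

```python
# Minimal sketch (not Monad's schema): normalize alerts from two hypothetical
# security tools into one flat record, then enrich it before loading it into a warehouse.
from datetime import datetime, timezone

def normalize_tool_a(raw: dict) -> dict:
    return {
        "source": "tool_a",
        "severity": raw["Severity"].lower(),
        "host": raw["Hostname"],
        "seen_at": raw["Timestamp"],  # assumed to already be ISO-8601
    }

def normalize_tool_b(raw: dict) -> dict:
    return {
        "source": "tool_b",
        "severity": {1: "low", 2: "medium", 3: "high"}.get(raw["sev"], "unknown"),
        "host": raw["asset_id"],
        "seen_at": datetime.fromtimestamp(raw["epoch"], tz=timezone.utc).isoformat(),
    }

def enrich(record: dict, asset_owners: dict) -> dict:
    # Enrichment example: attach the owning team so analysts can route the finding.
    return {**record, "owner": asset_owners.get(record["host"], "unassigned")}
```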

“Security is fundamentally a big data problem,” said Christian Almenar, CEO and co-founder of Monad. “Customers are often unable to access their security data in the streamlined manner that DevOps and cloud engineering teams need to build their apps quickly while also addressing their most pressing security and compliance challenges. We founded Monad to solve this security data challenge and liberate customers’ security data from siloed tools to make it accessible via any data warehouse of choice.”

The startup’s Series A funding round, which was also backed by Sequoia Capital, brings its total raised to $19 million and comes 12 months after its Sequoia-led seed round. The funds will enable Monad to scale its development efforts for its security data cloud platform, the startup said.

Monad was founded in May 2020 by security veterans Christian Almenar and Jacolon Walker. Almenar previously co-founded serverless security startup Intrinsic which was acquired by VMware in 2019, while Walker served as CISO and security engineer at OpenDoor, Collective Health, and Palantir.

#big-data, #cloud-computing, #cloud-infrastructure, #computer-security, #computing, #data-management, #data-warehouse, #devops, #funding, #information-technology, #intrinsic, #opendoor, #palantir, #security, #security-tools, #sequoia-capital, #serverless-computing, #technology, #vmware

Companies betting on data must value people as much as AI

The Pareto principle, also known as the 80-20 rule, asserts that 80% of consequences come from just 20% of causes, leaving the remaining causes far less impactful.

Those working with data may have heard a different rendition of the 80-20 rule: A data scientist spends 80% of their time at work cleaning up messy data as opposed to doing actual analysis or generating insights. Imagine a 30-minute drive expanded to two-and-a-half hours by traffic jams, and you’ll get the picture.

While most data scientists spend more than 20% of their time at work on actual analysis, they still have to waste countless hours turning a trove of messy data into a tidy dataset ready for analysis. This process can include removing duplicate data, making sure all entries are formatted correctly and doing other preparatory work.

On average, this workflow stage takes up about 45% of the total time, a recent Anaconda survey found. An earlier poll by CrowdFlower put the estimate at 60%, and many other surveys cite figures in this range.

None of this is to say data preparation is not important. “Garbage in, garbage out” is a well-known rule in computer science circles, and it applies to data science, too. In the best-case scenario, the script will just return an error, warning that it cannot calculate the average spending per client, because the entry for customer #1527 is formatted as text, not as a numeral. In the worst case, the company will act on insights that have little to do with reality.
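Here is a minimal pandas sketch of that failure mode; the column names and customer IDs are made up for illustration.

```python
import pandas as pd

# Toy data: one customer's spending was entered as text, not a number.
df = pd.DataFrame({
    "customer_id": [1525, 1526, 1527],
    "spend": [120.5, 89.0, "95 dollars"],
})

# df["spend"].mean() would raise a TypeError here -- the best-case scenario,
# where the script fails loudly. Coercing first surfaces the bad entry instead.
df["spend"] = pd.to_numeric(df["spend"], errors="coerce")
print(df[df["spend"].isna()])   # rows that need manual clean-up
print(df["spend"].mean())       # average over the valid entries
```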

The real question to ask here is whether re-formatting the data for customer #1527 is really the best way to use the time of a well-paid expert. The average data scientist is paid between $95,000 and $120,000 per year, according to various estimates. Having an employee at that pay level focus on mind-numbing, non-expert tasks is a waste both of their time and the company’s money. Besides, real-world data has a lifespan, and if a dataset for a time-sensitive project takes too long to collect and process, it can be outdated before any analysis is done.

What’s more, companies’ quests for data often end up wasting the time of non-data-focused personnel, with employees asked to help fetch or produce data instead of working on their regular responsibilities. Often, more than half of the data a company collects is never used at all, which suggests that the time of everyone involved in collecting it was wasted to produce nothing but operational delay and the associated losses.

The data that has been collected, on the other hand, is often only used by a designated data science team that is too overworked to go through everything that is available.

All for data, and data for all

The issues outlined here all play into the fact that, save for data pioneers like Google and Facebook, companies are still wrapping their heads around how to re-imagine themselves for the data-driven era. Data is pulled into huge databases, and data scientists are left with a lot of cleaning to do, while others, whose time was spent helping fetch the data, rarely benefit from it.

The truth is, we are still early when it comes to data transformation. The success of tech giants that put data at the core of their business models set off a spark that is only now beginning to spread. And even though the results are mixed for now, this is a sign that companies have yet to master thinking with data.

Data holds much value, and businesses are very much aware of it, as showcased by the appetite for AI experts in non-tech companies. Companies just have to do it right, and one of the key tasks in this respect is to start focusing on people as much as they do on AI.

Data can enhance the operations of virtually any component within the organizational structure of any business. As tempting as it may be to think of a future where there is a machine learning model for every business process, we do not need to tread that far right now. The goal for any company looking to tap data today comes down to getting it from point A to point B. Point A is the part in the workflow where data is being collected, and point B is the person who needs this data for decision-making.

Importantly, point B does not have to be a data scientist. It could be a manager trying to figure out the optimal workflow design, an engineer looking for flaws in a manufacturing process or a UI designer doing A/B testing on a specific feature. All of these people must have the data they need at hand all the time, ready to be processed for insights.

People can thrive with data just as well as models, especially if the company invests in them and makes sure to equip them with basic analysis skills. In this approach, accessibility must be the name of the game.

Skeptics may claim that big data is nothing but an overused corporate buzzword, but advanced analytics capabilities can enhance the bottom line for any company as long as the effort comes with a clear plan and appropriate expectations. The first step is to focus on making data accessible and easy to use, not on hauling in as much data as possible.

In other words, an all-around data culture is just as important for an enterprise as the data infrastructure.

#artificial-intelligence, #big-data, #column, #computing, #data, #data-analysis, #data-management, #data-scientist, #databases, #engineer, #enterprise, #facebook, #information, #machine-learning, #opinion, #startups, #tc

Insight Partners leads $30M round into Metabase, developing enterprise business intelligence tools

Open-source business intelligence company Metabase announced Thursday a $30 million Series B round led by Insight Partners.

Existing investors Expa and NEA joined in on the round, which gives the San Francisco-based company a total of $42.5 million in funding since it was founded in 2015. Metabase previously raised $8 million in Series A funding back in 2019, led by NEA.

Metabase was developed within venture studio Expa and spun out as an easy way for people to interact with data sets, co-founder and CEO Sameer Al-Sakran told TechCrunch.

“When someone wants access to data, they may not know what to measure or how to use it, all they know is they have the data,” Al-Sakran said. “We provide a self-service access layer where they can ask a question, Metabase scans the data and they can use the results to build models, create a dashboard and even slice the data in ways they choose without having an analyst build out the database.”

He notes that not much has changed in the business intelligence realm since Tableau came out more than 15 years ago, and that computers can do more for the end user, particularly in understanding what the user is trying to do. Increasingly, open source is the way software and information want to be consumed, especially for the person who just wants to pull the data themselves, he added.

George Mathew, managing director of Insight Partners, believes we are seeing the third generation of business intelligence tools emerging following centralized enterprise architectures like SAP, then self-service tools like Tableau and Looker and now companies like Metabase that can get users to discovery and insights quickly.

“The third generation is here and they are leading the charge to insights and value,” Mathew added. “In addition, the world has moved to the cloud, and BI tools need to move there, too. This generation of open source is a better and greater example of all three of those.”

To date, Metabase has been downloaded 98 million times and used by more than 30,000 companies across 200 countries. The company pursued another round of funding after building out a commercial offering, Metabase Enterprise, that is doing well, Al-Sakran said.

The new funding round enables the company to build out a sales team and continue product development on both Metabase Enterprise and Metabase Cloud. Because Metabase is often someone’s first business intelligence tool, Al-Sakran is also doubling down on resources to help educate customers on how to ask questions of and learn from their data.

“Open source has changed from floppy disks to projects on the cloud, and we think end users have the right to see what they are running,” Al-Sakran said. “We are continuing to create new features and improve performance and overall experience in efforts to create the BI system of the future.”

#artificial-intelligence, #business-intelligence, #business-software, #cloud, #cloud-computing, #cloud-infrastructure, #data-management, #enterprise, #expa, #funding, #george-mathew, #insight-partners, #metabase, #nea, #recent-funding, #sameer-al-sakran, #startups, #tc

xentral, an ERP platform for SMBs, raises $75M Series B from Tiger Global and Meritech

Enterprise Resource Planning systems have traditionally been the preserve of larger companies, but in recent years the amount of data that small and medium-sized businesses can generate has increased to the point where even SMEs/SMBs can get into the world of ERP. And that’s especially true for online-only businesses.

At the beginning of the year we covered the $20 million Series A funding of Xentral, a German startup that develops ERP for online small businesses, but it clearly didn’t plan to stop there.

It’s now raised a $75 million Series B funding from Tiger Global and Meritech, following up from existing investors Sequoia Capital, Visionaries Club (a B2B-focused VC out of Berlin), and Freigeist.

The cash will be used to enhance the product, hire staff and expand the UK operation as a step toward the wider global ERP market, which is expected to reach $32 billion by 2023.

Speaking to me over a call, Benedikt Sauter, founder and CEO of xentral, said: “We hook into Shopify, eBay, Amazon, Magento, WooCommerce, and also CRM systems like Pipedrive to collect the software together in one place, and try to do it all automatically in the background so that companies can really focus. Our goal is that a business owner who decides on Friday that they need a flexible ERP can implement and configure xentral over the weekend and hand it over to their team on Monday.”

The German startup covers services like order and warehouse management, packaging, fulfillment, accounting, and sales management, and, right now, the majority of its 1,000 customers are in Germany. Customers include the likes of direct-to-consumer brands like YFood, KoRo, the Nu Company and Flyeralarm.

John Curtius, Partner at Tiger Global, said: “Our diligence has uncovered a delighted customer base at xentral and a product offering that has evolved into a true mission-critical platform for ecommerce merchants globally. We are excited to partner with such product visionaries as Benedikt and Claudia as the business scales to serve customers not only in Europe but around the globe in the future.”

Xentral was Sequoia’s first investment in Europe since officially opening for business in the region this year. Sequoia has backed other European startups before, including Graphcore, Klarna, Tessian, Unity, UiPath, n8n and Evervault — but all of those deals were done from the US. Sequoia’s new partner in Europe, Luciana Lixandru, is understood to be joining Xentral’s board along with Visionaries’ Robert Lacher.

Alex Clayton, General Partner at Meritech, said: “Meritech invested in NetSuite in 2008 with the vision of bringing ERP to the cloud… We believe that xentral will bring automation to hundreds of thousands of SME businesses, dramatically improving multi-channel processes and data management in an ever-growing e-commerce market.”

Sauter and his co-founder Claudia Sauter (who is also his wife) built the early prototype of xentral originally for their first business in computer hardware sales.

#amazon, #articles, #artificial-intelligence, #berlin, #business, #business-partner, #ceo, #co-founder, #crm, #data-management, #ebay, #erp-software, #europe, #general-partner, #germany, #graphcore, #klarna, #luciana-lixandru, #magento, #meritech, #netsuite, #online-payments, #partner, #pipedrive, #sequoia-capital, #shopify, #tc, #tiger-global, #uipath, #united-kingdom, #united-states, #visionaries-club, #woocommerce, #xentral, #yfood

CockroachDB, the database that just won’t die

There is an art to engineering, and sometimes engineering can transform art. For Spencer Kimball and Peter Mattis, those two worlds collided when they created the widely successful open-source graphics program, GIMP, as college students at Berkeley.

That project was so successful that when the two joined Google in 2002, Sergey Brin and Larry Page personally stopped by to tell the new hires how much they liked it and explained how they used the program to create the first Google logo.

Cockroach Labs was started by developers and stays true to its roots to this day.

In terms of good fortune in the corporate hierarchy, when you get this type of recognition in a company such as Google, there’s only one way you can go — up. They went from rising stars to stars at Google, becoming the go-to guys on the Infrastructure Team. They could easily have looked forward to a lifetime of lucrative employment.

But Kimball, Mattis and another Google employee, Ben Darnell, wanted more — a company of their own. To realize their ambitions, they created Cockroach Labs, the business entity behind their ambitious open-source database CockroachDB. Can some of the smartest former engineers in Google’s arsenal upend the world of databases in a market spotted with the gravesites of storage dreams past? That’s what we are here to find out.

Berkeley software distribution

Mattis and Kimball were roommates at Berkeley majoring in computer science in the early-to-mid-1990s. In addition to their usual studies, they also became involved with the eXperimental Computing Facility (XCF), an organization of undergraduates who have a keen, almost obsessive interest in CS.

#amazon, #cockroach-labs, #cockroachdb, #cockroachdb-ec-1, #data-analysis, #data-management, #databases, #ec-cloud-and-enterprise-infrastructure, #ec-enterprise-applications, #ec-1, #enterprise, #larry-page, #mysql, #new-york-city, #relational-database, #saas, #snapchat, #startups

How engineers fought the CAP theorem in the global war on latency

CockroachDB was intended to be a global database from the beginning. The founders of Cockroach Labs wanted to ensure that data written in one location would be viewable immediately in another location 10,000 miles away. The use case was simple, but the work needed to make it happen was herculean.

The company is betting the farm that it can solve one of the largest challenges for web-scale applications. The approach it’s taking is clever, but a bit complicated, particularly for the non-technical reader. Given its history and engineering talent, the company is in the process of pulling it off and making a big impact on the database market, which makes the technology well worth understanding. In short, there’s value in digging into the details.

Using CockroachDB’s multiregion feature to segment data according to geographic proximity fulfills Cockroach Labs’ primary directive: To get data as close to the user as possible.

In part 1 of this EC-1, I provided a general overview and a look at the origins of Cockroach Labs. In this installment, I’m going to cover the technical details of the technology with an eye to the non-technical reader. I’m going to describe the CockroachDB technology through three questions:

  1. What makes reading and writing data over a global geography so hard?
  2. How does CockroachDB address the problem?
  3. What does it all mean for those using CockroachDB?

What makes reading and writing data over a global geography so hard?

Spencer Kimball, CEO and co-founder of Cockroach Labs, describes the situation this way:

There’s lots of other stuff you need to consider when building global applications, particularly around data management. Take, for example, the question and answer website Quora. Let’s say you live in Australia. You have an account and you store the particulars of your Quora user identity on a database partition in Australia.

But when you post a question, you actually don’t want that data to just be posted in Australia. You want that data to be posted everywhere so that all the answers to all the questions are the same for everybody, anywhere. You don’t want to have a situation where you answer a question in Sydney and then you can see it in Hong Kong, but you can’t see it in the EU. When that’s the case, you end up getting different answers depending where you are. That’s a huge problem.

Reading and writing data over a global geography is challenging for pretty much the same reason that it’s faster to get a pizza delivered from across the street than from across the city. The essential constraints of time and space apply. Whether it’s digital data or a pepperoni pizza, the further away you are from the source, the longer stuff takes to get to you.
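A back-of-envelope calculation shows why distance alone puts a floor under latency. The figures below (light traveling through fiber at roughly two-thirds of its vacuum speed, a straight-line path, no queuing or processing delays) are simplifying assumptions, not measurements of any real network.

```python
# Rough lower bound on round-trip time imposed by geography alone.
SPEED_OF_LIGHT_KM_S = 299_792          # in a vacuum
FIBER_FACTOR = 2 / 3                   # light in fiber travels roughly 2/3 as fast

def min_round_trip_ms(distance_km: float) -> float:
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000

print(f"Across the street (0.1 km): {min_round_trip_ms(0.1):.4f} ms")
print(f"Sydney to the EU (~16,000 km): {min_round_trip_ms(16_000):.1f} ms")
```

Even under these generous assumptions, a Sydney-to-EU round trip costs on the order of 160 milliseconds before a single byte of real work is done.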

#cockroach-labs, #cockroachdb, #cockroachdb-ec-1, #data-management, #database, #databases, #ec-cloud-and-enterprise-infrastructure, #ec-enterprise-applications, #ec-1, #enterprise, #relational-database, #saas, #startups

Scaling CockroachDB in the red ocean of relational databases

Most database startups avoid building relational databases, since that market is dominated by a few goliaths. Oracle, MySQL and Microsoft SQL Server have embedded themselves into the technical fabric of large- and medium-size companies going back decades. These established companies have a lot of market share and a lot of money to quash the competition.

So rather than trying to compete in the relational database market, over the past decade, many database startups focused on alternative architectures such as document-centric databases (like MongoDB), key-value stores (like Redis) and graph databases (like Neo4J). But Cockroach Labs went against conventional wisdom with CockroachDB: It intentionally competed in the relational database market with its relational database product.

While it did face an uphill battle to penetrate the market, Cockroach Labs saw a surprising benefit: It didn’t have to invent a market. All it needed to do was grab a share of a market that also happened to be growing rapidly.

Cockroach Labs has a bright future, compelling technology, a lot of money in the bank and has an experienced, technically astute executive team.

In previous parts of this EC-1, I looked at the origins of CockroachDB, presented an in-depth technical description of its product as well as an analysis of the company’s developer relations and cloud service, CockroachCloud. In this final installment, we’ll look at the future of the company, the competitive landscape within the relational database market, its ability to retain talent as it looks toward a potential IPO or acquisition, and the risks it faces.

CockroachDB’s success is not guaranteed. It has to overcome significant hurdles to secure a profitable place for itself among a set of well-established database technologies that are owned by companies with very deep pockets.

It’s not impossible, though. We’ll first look at MongoDB as an example of how a company can break through the barriers for database startups competing with incumbents.

When life gives you Mongos, make MongoDB

Dev Ittycheria, MongoDB CEO, rings the Nasdaq Stock Market Opening Bell. Image Credits: Nasdaq, Inc

MongoDB is a good example of the risks that come with trying to invent a new database market. The company started out as a purely document-centric database at a time when that approach was the exception rather than the rule.

Web developers like document-centric databases because they address a number of common use cases in their work. For example, a document-centric database works well for storing comments to a blog post or a customer’s entire order history and profile.

#aws, #baidu, #cloud, #cloud-computing, #cloud-services, #cockroach-labs, #cockroachdb, #cockroachdb-ec-1, #data-management, #database, #database-management, #ec-cloud-and-enterprise-infrastructure, #ec-enterprise-applications, #ec-1, #enterprise, #google, #mongodb, #mysql, #new-york-city, #nosql, #oracle, #relational-database, #saas, #startups

To guard against data loss and misuse, the cybersecurity conversation must evolve

Data breaches have become a part of life. They impact hospitals, universities, government agencies, charitable organizations and commercial enterprises. In healthcare alone, 2020 saw 640 breaches, exposing 30 million personal records, a 25% increase over 2019 that equates to roughly two breaches per day, according to the U.S. Department of Health and Human Services. On a global basis, 2.3 billion records were breached in February 2021.

It’s painfully clear that existing data loss prevention (DLP) tools are struggling to deal with the data sprawl, ubiquitous cloud services, device diversity and human behaviors that constitute our virtual world.

Conventional DLP solutions are built on a castle-and-moat framework in which data centers and cloud platforms are the castles holding sensitive data. They’re surrounded by networks, endpoint devices and human beings that serve as moats, defining the defensive security perimeters of every organization. Conventional solutions assign sensitivity ratings to individual data assets and monitor these perimeters to detect the unauthorized movement of sensitive data.
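As a toy illustration of that castle-and-moat model (purely hypothetical, not any vendor’s implementation), a perimeter rule might look like the following: label assets with a static sensitivity rating and block “restricted” data from crossing the boundary.

```python
# Hypothetical sketch of perimeter-style DLP: static sensitivity labels plus
# a rule that flags restricted data leaving the network boundary.
SENSITIVITY = {"payroll.csv": "restricted", "press_release.docx": "public"}

def check_transfer(filename: str, destination: str, internal_domains: set[str]) -> str:
    label = SENSITIVITY.get(filename, "unclassified")
    leaving_perimeter = destination.split("@")[-1] not in internal_domains
    if label == "restricted" and leaving_perimeter:
        return "BLOCK"
    return "ALLOW"

print(check_transfer("payroll.csv", "someone@gmail.com", {"corp.example.com"}))    # BLOCK
print(check_transfer("payroll.csv", "cfo@corp.example.com", {"corp.example.com"}))  # ALLOW
```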

Unfortunately, these historical security boundaries are becoming increasingly ambiguous and somewhat irrelevant as bots, APIs and collaboration tools become the primary conduits for sharing and exchanging data.

In reality, data loss is only half the problem confronting a modern enterprise. Corporations are routinely exposed to financial, legal and ethical risks associated with the mishandling or misuse of sensitive information within the corporation itself. The risks associated with the misuse of personally identifiable information have been widely publicized.

However, risks of similar or greater severity can result from the mishandling of intellectual property, material nonpublic information, or any type of data that was obtained through a formal agreement that placed explicit restrictions on its use.

Conventional DLP frameworks are incapable of addressing these challenges. We believe they need to be replaced by a new data misuse protection (DMP) framework that safeguards data from unauthorized or inappropriate use within a corporate environment in addition to its outright theft or inadvertent loss. DMP solutions will provide data assets with more sophisticated self-defense mechanisms instead of relying on the surveillance of traditional security perimeters.

#bridgecrew, #cloud-computing, #collaboration-tools, #column, #computer-security, #cryptography, #data-management, #dlp, #ec-column, #ec-cybersecurity, #ec-enterprise-applications, #enterprise, #security, #security-tools, #stacklet, #startups

Want in on the next $100B in cybersecurity?

As a Battery Ventures associate in 1999, I used to spend my nights highlighting actual magazines called Red Herring, InfoWorld and The Industry Standard, plus my personal favorites StorageWorld and Mass High Tech (because the other VC associates rarely scanned these).

As a 23-year-old, I’d circle the names of much older CEOs who worked at companies like IBM, EMC, Alcatel or Nortel to learn more about what they were doing. The companies were building mainframe-to-server replication technologies, IP switches and nascent web/security services on top.

Flash forward 22 years and, in a way, nothing has changed. We have gone from command line to GUI to now API as the interface innovation. But humans still need an interface, one that works for more types of people on more types of devices. We no longer talk about the OSI stack — we talk about the decentralized blockchain stack. We no longer talk about compute, data storage and analysis on a mainframe, but rather on the cloud.

The problems and opportunities have stayed quite similar, but the markets and opportunities have gotten much larger. AWS and Azure cloud businesses alone added $23 billion of run-rate revenue in the last year, growing at 32% and 50%, respectively — high growth on an already massive base.

The size of the cybersecurity market, in particular, has gotten infinitely larger as software eats the world and more people are able to sit and feast at the table from anywhere on Earth (and, soon enough, space).

Over the course of the last few months, my colleague Spencer Calvert and I released a series of pieces about why this market opportunity is growing so rapidly: the rise of multicloud environments, data being generated and stored faster than anyone can keep up with it, SaaS applications powering virtually every function across an organization and CISOs’ rise in political power and strategic responsibility.

This all ladders up to an estimated — and we think conservative — $100 billion of new market value by 2025 alone, putting total market size at close to $280 billion.

In other words, opportunities are ripe for massive business value creation in cybersecurity. We think many unicorns will be built in these spaces, and while we are still in the early innings, there are a few specific areas where we’re looking to make bets (and one big-picture, still-developing area). Specifically, Upfront is actively looking for companies building in:

  1. Data security and data abstraction.
  2. Zero-trust, broadly applied.
  3. Supply chains.

Data security and abstraction

Data is not a new thesis, but I am excited to look at the change in data stacks from an initial cybersecurity lens. What set of opportunities can emerge if we view security at the bottom of the stack — foundational — rather than as an application at the top or to the side?

Image Credits: Upfront Ventures

For example, data is expanding faster than we can secure it. We first need to know where the (structured and unstructured) data is located and what is being stored, confirm proper security posture, and prioritize fixing the most important issues at the right speed.

Doing this at scale requires smart passive mapping, along with heuristics and rules to pull the signal from the noise in an increasingly data-rich (noisy) world. Open Raven, an Upfront portfolio company, is building a solution to discover and protect structured and unstructured data at scale across cloud environments. New large platform companies will be built in the data security space as the point of control moves from the network layer to the data layer.
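As one illustration of the “heuristics and rules” idea, and emphatically not Open Raven’s actual implementation, a scanner might sample stored objects and apply simple pattern matching to flag likely PII.

```python
import re

# Illustrative-only heuristics; a real scanner layers many more signals.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def classify_sample(object_name: str, sample_text: str) -> dict:
    hits = {label: len(p.findall(sample_text)) for label, p in PATTERNS.items()}
    return {"object": object_name, "hits": hits, "likely_pii": any(hits.values())}

print(classify_sample("s3://bucket/export.csv", "jane@example.com, 123-45-6789"))
```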

We believe Open Raven is poised to be a leader in this space and also will power a new generation of “output” or application companies yet to be funded. These companies could be as big as Salesforce or Workday, built with data abstracted and managed differently from the start.

If we look at security data at the point it is created or discovered, new platforms like Open Raven may lead to the emergence of an entirely new ecosystem of apps, ranging from those Open Raven is most likely to build in-house — like compliance workflows — to entirely new companies that rebuild apps we have used since the beginning of time, which includes everything from people management systems to CRMs to product analytics to your marketing attribution tools.

Platforms that lead with a security-first, foundational lens have the potential to power a new generation of applications companies with a laser-focus on the customer engagement layer or the “output” layer, leaving the data cataloging, opinionated data models and data applications to third parties that handle data mapping, security and compliance.

Image Credits: Upfront Ventures

Put simply, if full-stack applications look like layers of the Earth, with UX as the crust, that crust can become better and deeper with foundational horizontal companies underneath meeting all the requirements surrounding personally identifiable information and GDPR, which are foisted upon companies that currently have data everywhere. This can free up time for new application companies to focus their creative talent even more deeply on the human-to-software engagement layer, building superhuman apps for every existing category.

Zero-trust

Zero-trust was first coined in 2010, but applications are still being discovered and large businesses are being built around the idea. Zero-trust, for those getting up to speed, is the assumption that anyone accessing your system, devices, etc., is a bad actor.

This could sound paranoid, but think about the last time you visited a Big Tech campus. Could you walk in past reception and security without a guest pass or name badge? Absolutely not. Same with virtual spaces and access. My first in-depth course on zero-trust security was with Fleetsmith. In 2017, I invested in Fleetsmith, a young team building software to manage apps, settings and security preferences for organizations powered by Apple devices. Zero-trust in the context of Fleetsmith was about device setup and permissions. Fleetsmith was acquired by Apple in mid-2020.

About the same time as the Fleetsmith acquisition, I met Art Poghosyan and the team at Britive. This team is also deploying zero-trust for dynamic permissioning in the cloud. Britive is being built on the premise of zero-trust just-in-time (JIT) access, whereby users are granted ephemeral access dynamically rather than through the legacy process of “checking out” and “checking in” credentials.

By granting temporary privileged access instead of “always-on” credentials, Britive is able to drastically reduce the cyber risks associated with over-privileged accounts and the time spent managing privileged access, while streamlining privileged access management workflows across multicloud environments.
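Reduced to a sketch, the JIT idea looks roughly like the following (a hypothetical illustration, not Britive’s API): mint a short-lived grant on demand and let it expire on its own rather than maintaining standing credentials.

```python
import secrets
import time

# Hypothetical just-in-time grant: access is minted on request and expires on
# its own, so there is no standing credential to steal or forget to revoke.
GRANTS = {}

def request_access(user: str, role: str, ttl_seconds: int = 900) -> str:
    token = secrets.token_urlsafe(16)
    GRANTS[token] = {"user": user, "role": role, "expires": time.time() + ttl_seconds}
    return token

def is_valid(token: str) -> bool:
    grant = GRANTS.get(token)
    return bool(grant) and time.time() < grant["expires"]

t = request_access("dev@example.com", "prod-db-readonly", ttl_seconds=2)
print(is_valid(t))   # True
time.sleep(3)
print(is_valid(t))   # False: the grant expired without anyone revoking it
```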

What’s next in zero-based trust (ZBT)? We see device and access as the new perimeter as workers flex devices and locations for their work, and we have invested around this with Fleetsmith and now Britive. But we still think there is more ground to cover for ZBT to permeate more mundane processes. Passwords are an example of something that is, in theory, zero-trust (you must continually prove who you are). But they are woefully inadequate.

Phishing attacks to steal passwords are the most common path to data breaches. But how do you get users to adopt password managers, password rotation, dual-factor authentication or even passwordless solutions? We want to back simple, elegant solutions to instill ZBT elements into common workflows.

Supply chains

Modern software is assembled using third-party and open-source components. This assembly line of public code packages and third-party APIs is known as a supply chain. Attacks that target this assembly line are referred to as supply chain attacks.

Some supply chain attacks can be mitigated by existing application security tools: Snyk and other software composition analysis (SCA) tools for open-source dependencies, Bridgecrew for automating security engineering and fixing misconfigurations, and Veracode for security scanning.

But other vulnerabilities can be extremely challenging to detect. Take the supply chain attack that took center stage — the SolarWinds hack of 2020 — in which a small snippet of code was altered in a SolarWinds update before spreading to 18,000 different companies, all of which relied on SolarWinds software for network monitoring or other services.

Image Credits: Upfront Ventures

How do you protect yourself from malicious code hidden in a version update of a trusted vendor that passed all of your security onboarding? How do you maintain visibility over your entire supply chain? Here we have more questions than answers, but securing supply chains is a space we will continue to explore, and we predict large companies will be built to securely vet, onboard, monitor and offboard third-party vendors, modules, APIs and other dependencies.
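One narrow, concrete layer of defense is verifying that an artifact you are about to install matches a digest recorded when it was first vetted. It would not have caught the SolarWinds compromise, since the malicious update was built and signed inside the vendor’s own pipeline, but it illustrates the kind of integrity check a supply chain program layers in. The file name and digest below are placeholders.

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file and return its SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Digest recorded out-of-band when the dependency was first vetted (placeholder value).
EXPECTED = {"vendor-agent-1.4.2.tar.gz": "placeholder-digest-recorded-at-vetting-time"}

def verify(path: str) -> bool:
    # Refuse to install anything whose digest does not match the vetted record.
    return sha256_of(path) == EXPECTED.get(path)
```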

If you are building in any of the above spaces, or adjacent spaces, please reach out. We readily acknowledge that the cybersecurity landscape is rapidly changing, and if you agree or disagree with any of the arguments above, I want to hear from you!

#cloud, #cloud-computing, #cloud-infrastructure, #column, #cybersecurity, #data-management, #security, #software-as-a-service, #technology, #venture-capital

SkyWatch raises $17.2M for its Earth observation data platform

Waterloo-based SkyWatch was among the first startups to recognize that the key to unlocking the real benefits of the space economy lay in making Earth observation data accessible and portable, and now the company has raised a $17.2 million Series B to help it further that goal. Fresh on the heels of a partnership with Poland-based satellite operator SatRevolution, SkyWatch is now set to bridge the gulf between satellite startups and customers in a bigger way as it lowers the barriers to entry for new companies focused on high-tech spacecraft payloads.

The new round of funding was led by new investor Drive Capital, and included participation from existing investors including Bullpen Capital, Space Capital, Golden Ventures and BDC Ventures. SkyWatch CEO and co-founder James Slifierz told me that bringing Drive on was a major win for the Series B.

“Drive is a firm that has actually been researching the space industry for a few years now, and looking for an opportunity that would be their first space technology investment,” he said. “Not their first in the [GTA-Waterloo] area; they’re based out of Columbus, Ohio, and made up of Silicon Valley veterans. They were a little early to the trend of believing that a majority of the really interesting and large opportunities would eventually evolve outside of the Bay Area and outside of New York City.”

SkyWatch definitely fits the bill, having built a strong revenue pipeline for an Earth observation data platform that makes the information collected by the many observation satellites on orbit accessible to private industry, in a way that doesn’t require re-architecting existing systems or handling huge amounts of data in unfamiliar formats.

This fresh funding will help SkyWatch accelerate the rollout of its TerraStream product, a new platform that the company developed to provide full-service data management, ordering, processing and delivery for satellite companies. This allows SkyWatch to not only make data collected by Earth observation satellites like those operated by SatRevolution accessible to customers, but also to source customers for those companies, effectively handling both sales and delivery — something many satellite startups born from a technical or academic background don’t start out equipped to tackle.

“My favorite analogy for TerraStream is it’s Shopify for space companies,” Slifierz said. “It takes away a lot of the complexity of going to market. So if you want to build an amazing shoe brand today, Shopify enables you to not have to worry about the logistics, and shipping, and the inventory management, the website, storefront and all that; it allows you to focus on the things that will build value in your company, which is the quality of your product, and your brand.”

He added that just like Shopify depends on the existence of a rich third-party ecosystem to support its platform, so does SkyWatch, and that ecosystem is only now reaching maturity after years of infrastructure development, including things like the proliferation of launch startups, ground station build out, data warehousing and more.

Ultimately, what SkyWatch provides is the ability to go to market “faster and more profitably,” Slifierz says, which is a major shift for hard tech satellite startups working on new and improved sensor capabilities, often spinning out of research labs at academic institutions.

“The strongest value proposition is that we give you instant access to hundreds of customers, which we’re growing at a very fast pace on the EarthCache [SkyWatch’s commercial satellite imagery marketplace] side. So in that way, we sort of joke, it’s like Shopify for space, but also integrated with the AWS marketplace.”

SkyWatch can also help identify demand by providing satellite-side customers with real data to back up signals of what the market is actually looking for. Slifierz says that has helped them advise partners on how to tweak their offerings to meet a real need, which is beneficial in an industry where research and tech development often lead payload design, with actual demand as a somewhat secondary consideration.

#aws, #bullpen-capital, #business, #companies, #data-management, #drive-capital, #economy, #golden-ventures, #inventory-management, #new-york-city, #satellite, #satrevolution, #space, #space-capital, #startup-company, #tc, #waterloo

Stemma launches with $4.8M seed to build managed data catalogue

As companies increasingly rely on data to run their businesses, having accurate sources of data becomes paramount. Stemma, a new early stage startup, has come up with a solution, a managed data catalogue that acts as an organization’s source of truth.

Today the company announced a $4.8 million seed investment led by Sequoia with assorted individual tech luminaries also participating. The product is also available for the first time today.

Company co-founder and CEO Mark Grover says the product is actually built on top of the open source Amundsen data catalogue project that he helped launch at Lyft to manage its massive data requirements. The problem was that with so much data, employees had to kludge together systems to confirm the data’s validity. Ultimately, manual processes like asking someone in Slack or even maintaining a wiki failed under the weight of trying to keep up with the data’s volume and velocity.

“I saw this problem first-hand at Lyft, which led me to create the open source Amundsen project with a team of talented engineers,” Grover said. That project now has 750 weekly users at Lyft. Since it was open sourced, 35 companies, including Brex, Snap and Asana, have adopted it.

What Stemma offers is a managed version of Amundsen that adds additional functionality like using intelligence to show data that’s meaningful to the person who is searching in the catalogue. It also can add metadata automatically to data as it’s added to the catalogue, creating documentation about the data on the fly, among other features.

The company launched last fall when Grover and co-founder and CTO Dorian Johnson decided to join forces and create a commercial product on top of Amundsen. Grover points out that Lyft was supportive of the move.

Today the company has five employees in addition to the founders, and it plans to add several more this year. As he does that, he is cognizant of diversity and inclusion in the hiring process. “I think it’s super important that we continue to invest in diversity, and the two ways that I think are the most meaningful for us right now is to have early employees that are from diverse groups, and that is the case within the first five,” he said. Beyond that, he says that as the company grows he wants to improve the ratio, while also looking at diversity in investors, board members and executives.

The company, which launched during COVID, is entirely remote right now and plans to remain that way for at least the short term. As the company grows, it will look at ways to build camaraderie, like organizing a regular cadence of employee offsite events.

#amundsen, #data-management, #enterprise, #funding, #open-source, #recent-funding, #sequoia-capital, #startups, #stemma

Peloton and Echelon profile photos exposed riders’ real-world locations

Security researchers say at-home exercise giant Peloton and its closest rival Echelon were not stripping user-uploaded profile photos of their metadata, in some cases exposing users’ real-world location data.

Almost every file, photo or document contains metadata, which is data about the file itself, such as how big it is, when it was created, and by whom. Photos and videos will often also include the location where they were taken. That location data helps online services tag your photos or videos to show that you were at this restaurant or that landmark.

But those online services — especially social platforms, where you see people’s profile photos — are supposed to remove location data from the file’s metadata so other users can’t snoop on where you’ve been, since location data can reveal where you live, work, where you go, and who you see.
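For context on what “stripping” means in practice, here is a minimal Pillow sketch that checks whether an uploaded image carries a GPS tag and re-saves the pixels without any metadata. It illustrates the general technique only; it is not the fix Peloton or Echelon shipped.

```python
from PIL import Image  # Pillow

GPS_IFD_TAG = 0x8825  # standard EXIF tag id for the GPSInfo block

def strip_exif(src_path: str, dst_path: str) -> bool:
    """Re-save the image with pixel data only; returns True if GPS data was present."""
    with Image.open(src_path) as img:
        had_gps = GPS_IFD_TAG in img.getexif()
        # Copy pixels into a fresh image so no metadata is carried over on save.
        clean = Image.new(img.mode, img.size)
        clean.putdata(list(img.getdata()))
        clean.save(dst_path)
    return had_gps
```

A server-side upload handler would run something like this before the profile photo is ever written to public storage.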

Jan Masters, a security researcher at Pen Test Partners, found the metadata exposure as part of a wider look at Peloton’s leaky API. TechCrunch verified the bug by uploading a profile photo with GPS coordinates of our New York office, and checking the metadata of the file while it was on the server.

The bugs were privately reported to both Peloton and Echelon.

Peloton fixed its API issues earlier this month but said it needed more time to fix the metadata bug and to strip existing profile photos of any location data. A Peloton spokesperson confirmed the bugs were fixed last week. Echelon fixed its version of the bug earlier this month. But TechCrunch held this report until we had confirmation that both companies had fixed the bug and that metadata had been stripped from old profile photos.

It’s not known how long the bug existed or if anyone maliciously exploited it to scrape users’ personal information. Any copies, whether cached or scraped, could represent a significant privacy risk to users whose location identifies their home address, workplace, or other private location.

Parler infamously didn’t scrub metadata from user-uploaded photos, which exposed the locations of millions of users when archivists exploited weaknesses on the platform’s API to download its entire contents. Others have been slow to adopt metadata stripping, like Slack, even if it got there in the end.

#api, #computing, #data, #data-management, #gps, #health, #information, #peloton, #pen-test-partners, #privacy, #security, #social-networks

Amplitude acquires Iteratively

Amplitude, the well-funded product intelligence startup that helps businesses use their data to predict which features will drive the best business outcomes for them, today announced that it has acquired Iteratively, a startup that helps businesses build trustworthy data pipelines.

Since data is at the core of Amplitude’s services, this acquisition will help the company bolster its data management capabilities that power many of its features, including its recently launched personalized recommendation engine.

This marks Amplitude’s second acquisition, after buying ClearBrain last March. With a total of $186.9 million in funding and a unicorn valuation, we may just see it do more of these in the not-so-distant future. The two companies did not disclose the price of the acquisition.

Image Credits: Amplitude

“I think the big story for us is to be able to expand beyond product analytics to really help companies one, measure user behavior, two, predict which features and actions lead to business outcomes, and then use that to intelligently adapt each experience based on those insights,” Amplitude EVP of Product Justin Bauer told me. “For us, that’s really the broader problem that we’re focused on and why we think digital optimization is really critical.”

As Bauer noted, the Amplitude Behavioral Graph that powers these optimization services is only as effective as the data that is fed into it — and that’s why buying Iteratively made a lot of sense at this point.

Bauer and Iteratively CEO Patrick Thompson tell me that the two companies started talking in earnest around the end of last year after hearing more about Iteratively from Amplitude’s customers. “As we spoke to our customers, we consistently heard from them the importance of proactively and continuously increasing the quality of data,” he said. “And many of them told us how excited they were about Iteratively to help them solve that. As we dug in more, we found out that 70% of their customers are Amplitude customers. That gave us a lot of conviction to explore the acquisition.”

Image Credits: Amplitude

The two companies also had compatible cultures and Bauer tells me that the executive team at Amplitude was impressed by Iteratively’s founding team, which, like virtually everybody else at Iteratively, will move to Amplitude.

“The biggest thing for us is we saw the excitement from many of our existing customers who were using Amplitude and how that helped change the way that they built product and delivered customer value,” Iteratively’s Thompson said. “From the early days, we were really big fans of what Amplitude was building and saw the impact that it was having on the industry. For us, when we thought about folks who fit our cultural values, where we could see ourselves as well as the rest of the team working, I don’t think there was another company at the top of the list beyond Amplitude who fit all those criteria. And generally, as we’ve been having conversations with these folks internally, it just made sense.”

Iteratively will continue to exist as a stand-alone product and will continue to support a wide range of customers, including those who don’t use other Amplitude services. Bauer and Thompson stressed that this was important to both teams, in part because the data governance and quality problem expands well beyond the Amplitude audience. As such, a stand-alone product like Iteratively will also likely help Amplitude bring in new customers. But at the same time, the teams have already integrated some of Iteratively’s technology into Amplitude’s stack and specifically the existing Amplitude Govern service that helps businesses oversee their data pipelines.

This new integration is now available to Amplitude customers through an early access program, with wider availability planned for later this year.

#acquisition, #amplitude, #ceo, #data-governance, #data-management, #exit, #iteratively, #martech, #startup-company, #startups, #video-games

Analytics as a service: Why more enterprises should consider outsourcing

With an increasing number of enterprise systems, growing teams, a proliferating web presence and multiple digital initiatives, companies of all sizes are creating loads of data every day. This data contains excellent business insights and immense opportunities, but it has become impossible for companies to consistently derive actionable insights from it due to its sheer volume.

According to Verified Market Research, the analytics-as-a-service (AaaS) market is expected to grow to $101.29 billion by 2026. Organizations that have not started on their analytics journey or are spending scarce data engineer resources to resolve issues with analytics implementations are not identifying actionable data insights. Through AaaS, managed services providers (MSPs) can help organizations get started on their analytics journey immediately without extravagant capital investment.

MSPs can take ownership of the company’s immediate data analytics needs, resolve ongoing challenges and integrate new data sources to manage dashboard visualizations, reporting and predictive modeling — enabling companies to make data-driven decisions every day.

AaaS could come bundled with multiple business-intelligence-related services. Primarily, the service includes (1) services for data warehouses; (2) services for visualizations and reports; and (3) services for predictive analytics, artificial intelligence (AI) and machine learning (ML). When a company partners with an MSP for analytics as a service, it is able to tap into business intelligence easily, instantly and at a lower cost of ownership than doing it in-house. This empowers the enterprise to focus on delivering better customer experiences, make decisions unencumbered and build data-driven strategies.

In today’s world, where customers value experiences over transactions, AaaS helps businesses dig deeper into their customers’ psyches and tap insights to build long-term winning strategies. It also enables enterprises to forecast and predict business trends by looking at their data and allows employees at every level to make informed decisions.

#analytics, #analytics-as-a-service, #artificial-intelligence, #big-data, #business-intelligence, #column, #data-management, #data-mining, #ec-column, #ec-enterprise-applications, #enterprise, #machine-learning, #predictive-analytics

Hacking my way into analytics: A creative’s journey to design with data

Growing up, did you ever wonder how many chairs you’d have to stack to reach the sky?

No? I guess that’s just me then.

As a child, I always asked a lot of “how many/much” questions. Some were legitimate (“How much is 1 USD in VND?”); some were absurd (“How tall is the sky and can it be measured in chairs?”). So far, I’ve managed to maintain my obnoxious statistical probing habit without making any mortal enemies in my 20s. As it turns out, that habit comes with its perks when working in product.

My first job as a product designer was at a small but energetic fintech startup whose engineers also dabbled in pulling data. I constantly bothered them with questions like, “How many exports did we have from that last feature launched?” and “How many admins created at least one rule on this page?” I was curious about quantitative analysis but did not know where to start.

I knew I wasn’t the only one. Even then, there was a growing need for basic data literacy in the tech industry, and it’s only getting more taxing by the year. Words like “data-driven,” “data-informed” and “data-powered” increasingly litter every tech organization’s product briefs. But where does this data come from? Who has access to it? How might I start digging into it myself? How might I leverage this data in my day-to-day design once I get my hands on it?

Data discovery for all: What’s in the way?

“Curiosity is our compass” is one of Kickstarter’s guiding principles. Powered by a desire for knowledge and information, curiosity is the enemy of many larger, older and more structured organizations — whether they admit it or not — because it hinders the production flow. Curiosity makes you pause and take time to explore and validate the “ask.” Asking as many what’s, how’s, why’s, who’s and how many’s as possible is important to help you learn if the work is worth your time.

#analytics, #business-intelligence, #column, #data-analysis, #data-management, #data-tools, #database, #developer, #ec-column, #startups

Data scientists: Bring the narrative to the forefront

By 2025, 463 exabytes of data will be created each day, according to some estimates. (For perspective, one exabyte of storage could hold 50,000 years of DVD-quality video.) It’s now easier than ever to translate physical and digital actions into data, and businesses of all types have raced to amass as much data as possible in order to gain a competitive edge.
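That DVD comparison holds up as rough arithmetic. Assuming DVD-quality video runs about 2 GB per hour (an assumed bit rate of roughly 4.5 Mbit/s) and taking an exabyte as 10^18 bytes:

```python
EXABYTE_GB = 1e9              # 1 exabyte is about one billion gigabytes
GB_PER_HOUR_DVD = 2.0         # assumed DVD-quality bit rate (~4.5 Mbit/s)

hours = EXABYTE_GB / GB_PER_HOUR_DVD
years = hours / 24 / 365
print(f"{years:,.0f} years of video")   # on the order of 50,000 years
```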

However, in our collective infatuation with data (and obtaining more of it), what’s often overlooked is the role that storytelling plays in extracting real value from data.

The reality is that data by itself is insufficient to really influence human behavior. Whether the goal is to improve a business’ bottom line or convince people to stay home amid a pandemic, it’s the narrative that compels action, rather than the numbers alone. As more data is collected and analyzed, communication and storytelling will become even more integral in the data science discipline because of their role in separating the signal from the noise.

Data alone doesn’t spur innovation — rather, it’s data-driven storytelling that helps uncover hidden trends, powers personalization, and streamlines processes.

Yet this can be an area where data scientists struggle. In Anaconda’s 2020 State of Data Science survey of more than 2,300 data scientists, nearly a quarter of respondents said that their data science or machine learning (ML) teams lacked communication skills. This may be one reason why roughly 40% of respondents said they were able to effectively demonstrate business impact “only sometimes” or “almost never.”

The best data practitioners must be as skilled in storytelling as they are in coding and deploying models — and yes, this extends beyond creating visualizations to accompany reports. Here are some recommendations for how data scientists can situate their results within larger contextual narratives.

Make the abstract more tangible

Ever-growing datasets help machine learning models better understand the scope of a problem space, but more data does not necessarily help with human comprehension. Even for the most left-brain of thinkers, it’s not in our nature to understand large abstract numbers or things like marginal improvements in accuracy. This is why it’s important to include points of reference in your storytelling that make data tangible.

For example, throughout the pandemic, we’ve been bombarded with countless statistics around case counts, death rates, positivity rates, and more. While all of this data is important, tools like interactive maps and conversations around reproduction numbers are more effective than massive data dumps in terms of providing context, conveying risk, and, consequently, helping change behaviors as needed. In working with numbers, data practitioners have a responsibility to provide the necessary structure so that the data can be understood by the intended audience.

#column, #computing, #data, #data-management, #data-visualization, #developer, #ec-column, #ec-consumer-applications, #ec-enterprise-applications, #enterprise, #machine-learning, #peter-wang, #startups, #storytelling

Meroxa raises $15M Series A for its real-time data platform

Meroxa, a startup that makes it easier for businesses to build the data pipelines to power both their analytics and operational workflows, today announced that it has raised a $15 million Series A funding round led by Drive Capital. Existing investors Root, Amplify and Hustle Fund also participated in this round, which together with the company’s previously undisclosed $4.2 million seed round now brings total funding in the company to $19.2 million.

The promise of Meroxa is that businesses can use a single platform for their various data needs and won’t need a team of experts to build their infrastructure and then manage it. At its core, Meroxa provides a single Software-as-a-Service solution that connects relational databases to data warehouses and then helps businesses operationalize that data.

Image Credits: Meroxa

“The interesting thing is that we are focusing squarely on relational and NoSQL databases into data warehouse,” Meroxa co-founder and CEO DeVaris Brown told me. “Honestly, people come to us as a real-time FiveTran or real-time data warehouse sink. Because, you know, the industry has moved to this [extract, load, transform] format. But the beautiful part about us is, because we do change data capture, we get that granular data as it happens.” And businesses want this very granular data to be reflected inside of their data warehouses, Brown noted, but he also stressed that Meroxa can expose this stream of data as an API endpoint or point it to a Webhook.

The company is able to do this because its core architecture is somewhat different from other data pipeline and integration services that, at first glance, seem to offer a similar solution. Because of this, users can use the service to connect different tools to their data warehouse but also build real-time tools on top of these data streams.

Image Credits: Meroxa

“We aren’t a point-to-point solution,” Meroxa co-founder and CTO Ali Hamidi explained. “When you set up the connection, you aren’t taking data from Postgres and only putting it into Snowflake. What’s really happening is that it’s going into our intermediate stream. Once it’s in that stream, you can then start hanging off connectors and say, ‘Okay, well, I also want to peek into the stream, I want to transfer my data, I want to filter out some things, I want to put it into S3.’”

With this flexibility, Hamidi noted, a lot of the company’s customers start with a pretty standard use case and then quickly expand into other areas as well.
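The architecture Hamidi describes, with one intermediate change-data-capture stream and multiple consumers attached to it, can be sketched generically. Everything below (the event shape, the sink functions) is hypothetical and only meant to illustrate the fan-out pattern; it is not Meroxa's API.

```python
# Generic sketch of the fan-out pattern described above: a single stream of
# change-data-capture (CDC) events with several independent sinks attached.
# The event format and sink functions are hypothetical, not Meroxa's API.
from typing import Callable, Dict, List

CDCEvent = Dict  # e.g. {"table": "orders", "op": "insert", "row": {...}}

class ChangeStream:
    def __init__(self) -> None:
        self.sinks: List[Callable[[CDCEvent], None]] = []

    def attach(self, sink: Callable[[CDCEvent], None]) -> None:
        self.sinks.append(sink)

    def publish(self, event: CDCEvent) -> None:
        # Every attached sink sees every change event as it happens.
        for sink in self.sinks:
            sink(event)

def to_warehouse(event: CDCEvent) -> None:
    print("load into warehouse:", event)

def to_object_storage(event: CDCEvent) -> None:
    if event["table"] == "orders":      # filter before archiving
        print("archive to S3 bucket:", event)

def to_webhook(event: CDCEvent) -> None:
    print("POST to downstream webhook:", event)

stream = ChangeStream()
for sink in (to_warehouse, to_object_storage, to_webhook):
    stream.attach(sink)

stream.publish({"table": "orders", "op": "insert", "row": {"id": 1, "total": 42.0}})
```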

Brown and Hamidi met during their time at Heroku, where Brown was a director of product management and Hamidi a lead software engineer. But while Heroku made it very easy for developers to publish their web apps, there wasn’t anything comparable in the highly fragmented database space. The team acknowledges that there are a lot of tools that aim to solve these data problems, but few of them focus on the user experience.

Image Credits: Meroxa

“When we talk to customers now, it’s still very much an unsolved problem,” Hamidi said. “It seems kind of insane to me that this is such a common thing and there is no ‘oh, of course you use this tool because it addresses all my problems.’ And so the angle that we’re taking is that we see user experience not as a nice-to-have, it’s really an enabler, it is something that enables a software engineer or someone who isn’t a data engineer with 10 years of experience in wrangling Kafka and Postgres and all these things. […] That’s a transformative kind of change.”

It’s worth noting that Meroxa relies on a lot of open-source tools, and the company has also committed to open-sourcing everything in its data plane. “This has multiple wins for us, but one of the biggest incentives is in terms of the customer, we’re really committed to having our agenda aligned. Because if we don’t do well, we don’t serve the customer. If we do a crappy job, they can just keep all of those components and run it themselves,” Hamidi explained.

Today, Meroxa, which the team founded in early 2020, has over 24 employees (and is 100% remote). “I really think we’re building one of the most talented and most inclusive teams possible,” Brown told me. “Inclusion and diversity are very, very high on our radar. Our team is 50% black and brown. Over 40% are women. Our management team is 90% underrepresented. So not only are we building a great product, we’re building a great company, we’re building a great business.”  

#api, #business-intelligence, #cloud, #computing, #data-management, #data-warehouse, #database, #developer, #drive-capital, #enterprise, #heroku, #hustle-fund, #information-technology, #nosql, #product-management, #recent-funding, #software-engineer, #startups, #web-apps

Startups must curb bureaucracy to ensure agile data governance

By now, all companies are fundamentally data driven. This is true regardless of whether they operate in the tech space. Therefore, it makes sense to examine the role data management plays in bolstering — and, for that matter, hampering — productivity and collaboration within organizations.

While the term “data management” inevitably conjures up mental images of vast server farms, the basic tenets predate the computer age. From censuses and elections to the dawn of banking, individuals and organizations have long grappled with the acquisition and analysis of data.

By understanding the needs of all stakeholders, organizations can start to figure out how to remove blockages.

One oft-quoted example is Florence Nightingale, a British nurse who, during the Crimean War, recorded and visualized patient records to highlight the dismal conditions in frontline hospitals. Over a century later, Nightingale is regarded not just as a humanitarian, but also as one of the world’s first data scientists.

As technology began to play a greater role, and the size of data sets began to swell, data management ultimately became codified in a number of formal roles, with names like “database analyst” and “chief data officer.” New challenges followed that formalization, particularly from the regulatory side of things, as legislators introduced tough new data protection rules — most notably the EU’s GDPR legislation.

This inevitably led many organizations to perceive data management as being akin to data governance, where responsibilities are centered around establishing controls and audit procedures, and things are viewed from a defensive lens.

That defensiveness is admittedly justified, particularly given the potential financial and reputational damages caused by data mismanagement and leakage. Nonetheless, there’s an element of myopia here, and being excessively cautious can prevent organizations from realizing the benefits of data-driven collaboration, particularly when it comes to software and product development.

Taking the offense

Data defensiveness manifests itself in bureaucracy. You start creating roles like “data steward” and “data custodian” to handle internal requests. A “governance council” sits above them, whose members issue diktats and establish operating procedures — while not actually working in the trenches. Before long, blockages emerge.

Blockages are never good for business. The first sign of trouble comes in the form of “data breadlines.” Employees seeking crucial data find themselves having to make their case to whoever is responsible. Time gets wasted.

By itself, this is bad enough. But the cultural impact is much worse. People are natural problem-solvers. That’s doubly true for software engineers. So they start figuring out how to circumvent established procedures, hoarding data in their own “silos.” Collaboration falters. Inconsistencies creep in as teams inevitably find themselves working from different versions of the same data set.

#agile-software-development, #business-intelligence, #column, #data-governance, #data-management, #ec-column, #ec-cybersecurity, #startups, #tc

No-code business intelligence service y42 raises $2.9M seed round

Berlin-based y42 (formerly known as Datos Intelligence), a data warehouse-centric business intelligence service that promises to give businesses access to an enterprise-level data stack that’s as simple to use as a spreadsheet, today announced that it has raised a $2.9 million seed funding round led by La Famiglia VC. Additional investors include the co-founders of Foodspring, Personio and Petlab.

The service, which was founded in 2020, integrates with over 100 data sources, covering all the standard B2B SaaS tools from Airtable to Shopify and Zendesk, as well as database services like Google’s BigQuery. Users can then transform and visualize this data, orchestrate their data pipelines and trigger automated workflows based on this data (think sending Slack notifications when revenue drops or emailing customers based on your own custom criteria).
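As a rough illustration of the kind of automated workflow described above (not y42's actual interface), the sketch below checks a revenue metric and posts to a Slack incoming webhook when it drops below a threshold; the webhook URL, revenue figures and threshold are placeholders.

```python
# Sketch of a "notify Slack when revenue drops" workflow, similar in spirit
# to the triggers described above. The webhook URL, revenue figures and
# threshold are placeholders; this is not y42's actual API.
import requests

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder
ALERT_THRESHOLD = 0.85  # alert if revenue falls below 85% of the prior period

def check_revenue(current: float, previous: float) -> None:
    ratio = current / previous
    if ratio < ALERT_THRESHOLD:
        message = (f":warning: Revenue dropped to {current:,.0f} "
                   f"({ratio:.0%} of the previous period).")
        requests.post(SLACK_WEBHOOK_URL, json={"text": message}, timeout=10)

# Example values, e.g. pulled from the warehouse by a scheduled job.
check_revenue(current=81_000, previous=104_000)
```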

Like similar startups, y42 extends the idea of the data warehouse, which was traditionally used for analytics, and helps businesses operationalize this data. At the core of the service is a lot of open-source software, and the company, for example, contributes to GitLab’s Meltano platform for building data pipelines.

y42 founder and CEO Hung Dang.

“We’re taking the best of breed open-source software. What we really want to accomplish is to create a tool that is so easy to understand and that enables everyone to work with their data effectively,” y42 founder and CEO Hung Dang told me. “We’re extremely UX obsessed and I would describe us as no-code/low-code BI tool — but with the power of an enterprise-level data stack and the simplicity of Google Sheets.”

Before y42, Vietnam-born Dang co-founded a major events company that operated in over 10 countries and made millions in revenue (but with very thin margins), all while finishing up his studies with a focus on business analytics. And that in turn led him to also found a second company that focused on B2B data analytics.

Image Credits: y42

Even while building his events company, he noted, he was always very product- and data-driven. “I was implementing data pipelines to collect customer feedback and merge it with operational data — and it was really a big pain at that time,” he said. “I was using tools like Tableau and Alteryx, and it was really hard to glue them together — and they were quite expensive. So out of that frustration, I decided to develop an internal tool that was actually quite usable and in 2016, I decided to turn it into an actual company.”

He then sold this company to a major publicly listed German company. An NDA prevents him from talking about the details of this transaction, but maybe you can draw some conclusions from the fact that he spent time at Eventim before founding y42.

Given his background, it’s maybe no surprise that y42’s focus is on making life easier for data engineers and, at the same time, putting the power of these platforms in the hands of business analysts. Dang noted that y42 typically provides some consulting work when it onboards new clients, but that’s mostly to give them a head start. Given the no-code/low-code nature of the product, most analysts are able to get started pretty quickly  — and for more complex queries, customers can opt to drop down from the graphical interface to y42’s low-code level and write queries in the service’s SQL dialect.

The service itself runs on Google Cloud, and the 25-person team manages about 50,000 jobs per day for its clients. The company’s customers include the likes of LifeMD, Petlab and Everdrop.

Until raising this round, Dang self-funded the company and had also raised some money from angel investors. But La Famiglia felt like the right fit for y42, especially due to its focus on connecting startups with more traditional enterprise companies.

“When we first saw the product demo, it struck us how on top of analytical excellence, a lot of product development has gone into the y42 platform,” said Judith Dada, general partner at La Famiglia VC. “More and more work with data today means that data silos within organizations multiply, resulting in chaos or incorrect data. y42 is a powerful single source of truth for data experts and non-data experts alike. As former data scientists and analysts, we wish that we had y42 capabilities back then.”

Dang tells me he could have raised more but decided that he didn’t want to dilute the team’s stake too much at this point. “It’s a small round, but this round forces us to set up the right structure. For the Series A, which we plan to be towards the end of this year, we’re talking about a dimension which is 10x,” he told me.

#alteryx, #analytics, #berlin, #big-data, #business-intelligence, #business-software, #ceo, #cloud, #data, #data-analysis, #data-management, #data-warehouse, #enterprise, #general-partner, #information-technology, #judith-dada, #recent-funding, #shopify, #sql, #startups, #vietnam

A crypto company’s journey to Data 3.0

Data is a gold mine for a company.

If managed well, it provides the clarity and insights that lead to better decision-making at scale, in addition to an important tool to hold everyone accountable.

However, most companies are stuck in Data 1.0, which means they are leveraging data as a manual and reactive service. Some have started moving to Data 2.0, which employs simple automation to improve team productivity. The complexity of crypto data has opened up new opportunities in data, namely to move to the new frontier of Data 3.0, where you can scale value creation through systematic intelligence and automation. This is our journey to Data 3.0.

Coinbase is neither a finance company nor a tech company — it’s a crypto company. This distinction has big implications for how we work with data. As a crypto company, we work with three major types of data (instead of the usual one or two types of data), each of which is complex and varied:

  1. blockchain: decentralized and publicly available
  2. product: large and real-time
  3. financial: high-precision and subject to many financial/legal/compliance regulations.

Our focus has been on how we can scale value creation by making this varied data work together, eliminating data silos, solving issues before they start and creating opportunities for Coinbase that wouldn’t exist otherwise.

Having worked at tech companies like LinkedIn and eBay, and also those in the finance sector, including Capital One, I’ve observed firsthand the evolution from Data 1.0 to Data 3.0. In Data 1.0, data is seen as a reactive function providing ad-hoc manual services or firefighting in urgent situations.

#artificial-intelligence, #coinbase, #column, #data-analysis, #data-management, #developer, #ec-cloud-and-enterprise-infrastructure, #ec-column, #enterprise, #machine-learning, #payments, #tc

Microsoft Azure expands its NoSQL portfolio with Managed Instances for Apache Cassandra

At its Ignite conference today, Microsoft announced the launch of Azure Managed Instance for Apache Cassandra, its latest NoSQL database offering and a competitor to Cassandra-centric companies like Datastax. Microsoft describes the new service as a “semi-managed offering” that will help companies bring more of their Cassandra-based workloads into its cloud.

“Customers can easily take on-prem Cassandra workloads and add limitless cloud scale while maintaining full compatibility with the latest version of Apache Cassandra,” Microsoft explains in its press materials. “Their deployments gain improved performance and availability, while benefiting from Azure’s security and compliance capabilities.”

Like its counterpart, Azure SQL Managed Instance, the idea here is to give users access to a scalable, cloud-based database service. Until now, businesses that wanted to use Cassandra in Azure had to either move to Cosmos DB, Microsoft’s highly scalable database service that supports the Cassandra, MongoDB, SQL and Gremlin APIs, or manage their own fleet of virtual machines or on-premises infrastructure.

Cassandra was originally developed at Facebook and then open-sourced in 2008. A year later, it joined the Apache Foundation, and today it’s used widely across the industry, with companies like Apple and Netflix betting on it for some of their core services. AWS launched a managed Cassandra-compatible service at its re:Invent conference in 2019 (it’s called Amazon Keyspaces today), while Microsoft only launched the Cassandra API for Cosmos DB last November. With today’s announcement, though, the company can now offer a full range of Cassandra-based services for enterprises that want to move these workloads to its cloud.
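For readers unfamiliar with what “full compatibility with the latest version of Apache Cassandra” means in practice, a wire-compatible managed endpoint can typically be reached with the standard open-source driver. The sketch below uses the DataStax Python driver (cassandra-driver); the hostname, credentials and keyspace are placeholders, and managed services usually also require TLS settings not shown here.

```python
# Sketch: connecting to a Cassandra-compatible endpoint with the open-source
# Python driver (pip install cassandra-driver). Hostname, credentials and
# keyspace are placeholders; managed offerings typically also require TLS.
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

auth = PlainTextAuthProvider(username="cassandra", password="<password>")
cluster = Cluster(["cassandra.example.com"], port=9042, auth_provider=auth)
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")
session.set_keyspace("demo")
session.execute(
    "CREATE TABLE IF NOT EXISTS events (id uuid PRIMARY KEY, payload text)"
)
cluster.shutdown()
```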

#amazon, #apache-cassandra, #api, #apple, #aws, #cloud, #computing, #data, #data-management, #datastax, #developer, #enterprise, #facebook, #microsoft, #microsoft-ignite-2021, #microsoft-azure, #mongodb, #netflix, #nosql, #sql, #tc

After 80% ARR growth in 2020, Saltmine snags $20M to help employees return to a ‘new normal’ office

What is working in the office going to look like in a post-COVID-19 world?

That’s something one startup hopes to help companies figure out.

Saltmine, which has developed a web-based workplace design platform, has raised $20 million in a Series A funding round.

Existing backers Jungle Ventures and Xplorer Capital led the financing, which also included participation from JLL Spark, the strategic investment arm of commercial real estate brokerage JLL. 

Notably, JLL is not only investing in Saltmine, but is also partnering with the San Francisco-based startup to sell its service directly to its clients — opening up a whole new revenue stream for the four-year-old company.

Saltmine claims its cloud-based technology does for corporate real estate heads what Salesforce did for CROs in digitizing and streamlining the office design process. It saw an 80% spike in ARR (annual recurring revenue) last year while doubling the number of companies it works with, according to CEO and founder Shagufta Anurag. Its more than 35 customers include PG&E, Snowflake, Fidelity and Workday, among others. Its mission, put simply, is to help companies “create the best possible workplaces for their employees.”

Saltmine claims to have a 95% customer retention rate and in 2020 saw 350% year over year growth in monthly active users of its SaaS platform. So far, the square footage of all the office real estate properties designed and analyzed by customers on Saltmine totals 50 million square feet across 1,500 projects.

Saltmine says it offers companies tools to do things like establish social distancing measures in the office. Its platform, the company says, houses all workplace data — including strategy, design, pricing and portfolio analytics — in one place. It combines and analyzes floor plans, project requirements and real-time behavioral data (aggregated through a combination of utilization sensors and employee feedback) to identify companies’ design needs. Besides aiming to improve the workplace design process, Saltmine claims to be able to help companies “optimize their real estate portfolios.”

The pandemic has dramatically increased the need for a digital transformation of how workplaces are designed and reimagined, according to Anurag. 

“Given the need for social distancing capabilities and a greater emphasis on work-life balance in many office settings, few workers expect a complete ‘return to normal,’ ” she said. “There is now enormous pressure on corporate heads of real estate to adapt and modify their workplaces.”

Once companies identify their new needs, Saltmine uses “immersive” digital 3D renderings to help them visualize the necessary changes to their real estate properties.

Singapore-based Anurag has previous experience in the design world, having founded Space Matrix, a large interior design firm in Asia, as well as Livspace, a digital home interior design company.

“I saw the same pain points and unmet needs in office real estate that I did in the residential market,” she said. “Real estate is the second-largest cost for companies and has a direct impact on their largest cost — their people.”

Looking ahead, Saltmine plans to use its new capital to (naturally) do some hiring and continue to acquire customers — in particular, seeking to expand its portfolio of Global 2000 companies.

Saltmine has about 125 employees in five offices across Asia, Europe and North America. It expects to have 170 employees by year’s end and to be profitable by the end of fiscal year 2021.

The company’s initial focus has been in North America, but it is now beginning to expand into APAC and Australia. 

JLL Technologies’ co-CEO Yishai Lerner said JLL Spark was drawn to Saltmine’s approach of making data and analytics accessible in one place.

“Having a single source of truth for data also facilitates collaboration across teams, which is important, for example, in workspace planning,” he told TechCrunch. “This reduces inefficiencies and improves workflows in today’s fragmented design, build and fit-out market.”

JLL Spark invests in companies that it believes can benefit from its distribution and network — hence the firm’s agreement to sell Saltmine’s software directly to its customers.

“As JLL tenants and clients continue to embrace the future of work, they are seeking technology solutions that keep their buildings running efficiently and effectively,” Lerner said. “Saltmine’s platform checks all of the boxes by streamlining stakeholder collaboration, increasing transparency and simplifying data management.”

#data-management, #funding, #future-of-work, #jll-spark, #jll-technologies, #jungle-ventures, #real-estate, #recent-funding, #saas, #saltmine, #san-francisco, #startups, #tc, #venture-capital, #xplorer-capital

Census raises $16M Series A to help companies put their data warehouses to work

Census, a startup that helps businesses sync their customer data from their data warehouses to their various business tools like Salesforce and Marketo, today announced that it has raised a $16 million Series A round led by Sequoia Capital. Other participants in this round include Andreessen Horowitz, which led the company’s $4.3 million seed round last year, as well as several notable angels, including Figma CEO Dylan Field, GitHub CTO Jason Warner, Notion COO Akshay Kothari and Rippling CEO Parker Conrad.

The company is part of a new crop of startups that are building on top of data warehouses. The general idea behind Census is to help businesses operationalize the data in their data warehouses, which was traditionally only used for analytics and reporting use cases. But as businesses realized that all the data they needed was already available in their data warehouses and that they could use that as a single source of truth without having to build additional integrations, an ecosystem of companies that operationalize this data started to form.

The company argues that the modern data stack, with data warehouses like Amazon Redshift, Google BigQuery and Snowflake at its core, offers all of the tools a business needs to extract and transform data (like Fivetran, dbt) and then visualize it (think Looker).

Tools like Census then essentially function as a new layer that sits between the data warehouse and the business tools that can help companies extract value from this data. With that, users can easily sync their product data into a marketing tool like Marketo or a CRM service like Salesforce, for example.
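In pseudocode terms, this “reverse ETL” layer reads already-modeled rows out of the warehouse and upserts them into a business tool. The sketch below is a generic illustration under assumed names: fetch_product_accounts and upsert_crm_account stand in for a warehouse query and a CRM API call and are not Census functions.

```python
# Generic reverse-ETL sketch: read modeled rows from the warehouse and upsert
# them into a CRM, keyed on an external ID. fetch_product_accounts() and
# upsert_crm_account() are stand-ins, not Census's actual API.
from typing import Dict, Iterable

def fetch_product_accounts() -> Iterable[Dict]:
    """Stand-in for a warehouse query, e.g. SELECT ... FROM analytics.accounts."""
    return [
        {"account_id": "acme", "plan": "enterprise", "weekly_active_users": 412},
        {"account_id": "globex", "plan": "free", "weekly_active_users": 7},
    ]

def upsert_crm_account(external_id: str, fields: Dict) -> None:
    """Stand-in for a CRM API call keyed on an external ID."""
    print(f"upsert {external_id}: {fields}")

def sync() -> None:
    for row in fetch_product_accounts():
        upsert_crm_account(
            external_id=row["account_id"],
            fields={
                "Plan__c": row["plan"],                        # hypothetical CRM field names
                "Weekly_Active_Users__c": row["weekly_active_users"],
            },
        )

if __name__ == "__main__":
    sync()
```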

Image Credits: Census

“Three years ago, we were the first to ask, ‘Why are we relying on a clumsy tangle of wires connecting every app when everything we need is already in the warehouse? What if you could leverage your data team to drive operations?’ When the data warehouse is connected to the rest of the business, the possibilities are limitless,” Census explains in today’s announcement. “When we launched, our focus was enabling product-led companies like Figma, Canva, and Notion to drive better marketing, sales, and customer success. Along the way, our customers have pulled Census into more and more scenarios, like auto-prioritizing support tickets in Zendesk, automating invoices in Netsuite, or even integrating with HR systems.”

Census already integrates with dozens of different services and data tools and its customers include the likes of Clearbit, Figma, Fivetran, LogDNA, Loom and Notion.

Looking ahead, Census plans to use the new funding to launch new features like deeper data validation and a visual query experience. In addition, it also plans to launch code-based orchestration to make Census workflows versionable and easier to integrate into enterprise orchestration systems.

#andreessen-horowitz, #business-intelligence, #canva, #ceo, #clearbit, #computing, #crm, #cto, #data-management, #data-warehouse, #dylan-field, #enterprise, #figma, #fivetran, #github, #google, #information, #information-technology, #logdna, #looker, #loom, #marketo, #netsuite, #notion, #parker-conrad, #recent-funding, #salesforce, #sequoia-capital, #startups, #tc, #warehouse, #zendesk

Big data VC OpenOcean hits $111.5M for third fund, appoints Ekaterina Almasque to GP

OpenOcean, a European VC which has tended to specialise in big data-oriented startups and deep tech, has reached the €92 million ($111.5 million) mark for its third main venture fund, and is aiming for a final close of €130 million by the middle of this year. LPs in the new fund include the European Investment Fund (EIF), Tesi, pension funds, major family offices and Oxford University’s Corpus Christi College.

Ekaterina Almasque — who has already led investments in IQM (superconducting quantum machines) and Sunrise.io (multi-cloud hyper-converged infrastructure) and is leading the London team and operations for the firm — has been appointed as general partner. Before joining, Almasque was a managing director at Samsung Catalyst Fund in Europe, where she led investments in Graphcore’s processor for artificial intelligence, Mapillary’s layer for rapid mapping and AIMotive’s autonomous driving stack.

The enormous wealth of data in the modern world means the next generation of software is being built at the infrastructure layer. Thus, the fund said it would invest primarily at the Series A level, with initial investments of €3 million to €5 million, across OpenOcean’s principal areas of artificial intelligence, application-driven data infrastructure, intelligent automation and open source.

OpenOcean’s team includes Michael “Monty” Widenius, the “spiritual father” of MariaDB, and one of the original developers of MySQL, the predecessor to MariaDB; Tom Henriksson, who invested in MySQL and MariaDB; as well as Ralf Wahlsten and Patrik Backman.

Tom Henriksson, general partner at OpenOcean, commented: “Ekaterina… brings an immense amount of expertise to the team and exemplifies the way we want to support our founders. Fund 2020 is an important step for OpenOcean, with prestigious LPs trusting our approach and our knowledge, and believing in our ability to identify the very best data solutions and infrastructure technologies in Europe.”

Almasque said: “The next five years will be critical for digital infrastructure, as breakthrough technologies are currently being constrained by the capabilities of the stack. Enabling this next level of infrastructure innovation is crucial to realising digitisation projects across the economy and will determine what the internet of the future looks like. We’re excited by the potential of world-leading businesses being built across Europe and are looking forward to supporting the next generation of software leaders.”

Speaking to TechCrunch she added: “It’s very rare to find such a VC so deep in the stack which also invested in one of the first unicorns in Europe and really built the open source ecosystem globally. So for me, this was absolutely an interesting team to join. And what OpenOcean was doing since inception in 2011 was very unique among pioneering ecosystems, such as big data analytics… and it remains very pioneering, pushing the frontiers in artificial intelligence and now quantum computing. This is what really attracts me, and I think there is a very, very big future.”

In an interview Henriksson told me: “What we are seeing is that our economy is shifting more and more towards the digital, data-driven economy. It started with few industries, but now we see a larger shift, including new industries like healthcare, like manufacturing.”

Asked about the effects of the pandemic on the sector, he said: “Obviously we see a lot of startups who are plugging into things like the UiPath platform. This is very relevant for the pandemic. Because the companies that had started automating strongly before the pandemic hit… they’ve actually accelerated and they find benefits for their teams and organisations and actually the people are happier because they have better automation technologies in place. The ones that didn’t start before [the pandemic hit] they’re a little behind now.”

#aria, #artificial-intelligence, #big-data, #computing, #data-management, #databases, #drupal, #europe, #european-investment-fund, #infrastructure, #london, #manufacturing, #mapillary, #mariadb, #mysql, #openocean, #tc, #venture-capital, #wordpress

LyteLoop raises $40 million to launch satellites that use light to store data

Soon, your cloud photo backups could reside on beams of light transmitted between satellites instead of in huge, power-hungry server farms here on Earth. Startup LyteLoop has spent the past five years tackling the physics challenges that can make that possible, and now it’s raised $40 million to help it leapfrog the remaining engineering hurdles and make its bold vision a reality.

LyteLoop’s new funding will provide it with enough runway to achieve its next major milestone: putting three prototype satellites equipped with its novel data storage technology into orbit within the next three years. The company intends to build and launch six of these, which will demonstrate how its laser-based storage medium operates on orbit.

I spoke to LyteLoop CEO Ohad Harlev about the company’s progress, technology and plans. Harlev said five years into its founding, the company is very confident in the science that underlies its data storage methods – and thrilled about the advantages it could offer over traditional data warehousing technology used today. Security, for instance, gets a big boost from LyteLoop’s storage paradigm.

“Everybody on every single data center has the same possible maximum level of data security,” he said. “We can provide an extra four layers of cyber security, and they’re all physics-based. Anything that can be applied on Earth, we can apply in our data center, but for example, the fact that we’re storing data on photons, we could put in quantum encryption, which others can’t. Plus, there are big security benefits because the data is in motion, in space, and moving at the speed of light.”

On top of security, LyteLoop’s model also offers benefits when it comes to privacy, because the data it’s storing is technically always in transit between satellites, which means it’ll be subject to an entirely different set of regulations vs. those that come into play when you’re talking about data which is warehoused on drives in storage facilities. LyteLoop also claims advantages in terms of access, because the storage and the network are one and the same, with the satellites able to provide their information to ground stations anywhere on Earth. Finally, Harlev points out that it’s incredibly power efficient and ecologically sound, since it doesn’t require the millions of gallons of water for cooling that current data center storage practices do, both significant downsides of today’s approach.

On top of all of that, Harlev says that LyteLoop’s storage will not only be cost-competitive with current cloud-based storage solutions, but will in fact be more affordable – even without factoring in likely decreases to come in launch costs as SpaceX iterates on its own technology and more small satellite launch providers, including Virgin Orbit and Rocket Lab, come online and expand their capacity.

“Although it’s more expensive to build and launch the satellite, it is still a lot cheaper to maintain them in the space,” he said. “So when we do a total cost of ownership calculation, we are cheaper, considerably cheaper, on a total cost of ownership basis. However […] when we compare what the actual users can do, you know, we can definitely go to completely different pricing model.”

Harlev is referring to the possibility of bundled pricing for combining storage and delivery – other providers would require that you supply the network, for instance, in order to move the data you’re storing. LyteLoop’s technology could also offset existing spend on reducing a company’s carbon footprint, because of its much-reduced ecological impact.

The company is focused squarely on getting its satellites to market, with a plan to take its proof of concept and expand it to a full production satellite roughly five years from now, with an initial service offering made available at that time. But LyteLoop’s tech could have equally exciting applications here on Earth. Harlev says that if you created a LyteLoop data center roughly the size of a football field, it would be roughly 500 times as efficient at storing data as traditional data warehousing.

The startup’s technology, which essentially stores data on photons instead of physical media, just requires far less matter than do our current ways of doing things, which not only helps its environmental impact, but which also makes it a much more sensible course for in-space storage when compared to physical media. The launch business is all about optimizing mass to orbit in order to reduce costs, and as Harlev notes, photons are massless.

#aerospace, #ceo, #cloud-computing, #computing, #data-management, #elon-musk, #funding, #hyperloop, #laser, #physical-media, #quantum-encryption, #recent-funding, #rocket-lab, #satellite, #small-satellite, #space, #spaceflight, #spacex, #startup, #startups, #tc, #technology, #virgin-orbit