Expect Labs has been on a bit of a tear lately — within the past six months the San Francisco-based startup showed off its fan-favorite app at Disrupt SF 2012, scored itself $2.4 million in funding from some big names, and inked a deal with the voice recognition mavens at Nuance. Now Expect Labs has tapped big data startup Factual Inc. and its sizable store of location data in a bid to make its vaunted Anticipatory Computing Engine even smarter.
Here’s a quick primer on Factual, in case you’re not quite up to speed. The brainchild of founder Gil Elbaz, Factual first hit the scene back in 2007 as a place for users to share data of all sorts. No, really: some of the company’s earliest examples included crowdsourced data on U.S. prisoners on death row and American Idol finalists. Eventually, Factual shifted to focus more on certain subsets of data like local businesses and regional points of interest and made it all available to developers by way of an API. Oh, and they locked up a $25 million Series A round led by Andreessen Horowitz and Index along the way. Not too shabby.
And how exactly does this fit in with Expect Labs’ oeuvre? Well, the Anticipatory Computing Engine is all about analyzing ongoing conversations and preemptively offering up information it thinks is relevant to what’s going on (as seen in the demo of the team’s consumer-facing MindMeld iPad app below). Currently, Factual plays home to data on “58 million local businesses and points of interest in 50 countries,” a huge sum of information for the team’s ACE to serve up as it listens in on conversations in progress.
We’ve already begun to see a shift towards more context-sensitive computing experiences, especially where mobile devices are involved. Google Now is perhaps the most prevalent, as it’s baked directly into newer versions of Android and chews on a user’s calendar, travel preferences, and search history to offer up relevant notifications right when they’re needed. Meanwhile, apps like Grokr for iOS (which also happens to lean on Factual’s data) take that concept and make it more passive and predictive, not entirely unlike Expect Labs’ approach, actually.
Sadly, the MindMeld iPad app that was meant to be Expect Labs’ first big, publicly available demonstration of the anticipatory computing engine in action is still being worked on. The company originally aimed to push the app out the door shortly after its turn on our Disrupt SF 2012 stage, but tremendous positive response prompted the team to apply a few more coats of polish before releasing it into the wild. Since then the startup has made appearances at CES (where MindMeld was fondly received once again), but there’s still no hard release date — company representatives have said it’s coming “really soon.”
Article courtesy of TechCrunch
Factual is a company that provides open data sets for developers, most notably its location data sets and APIs. The company just announced that it has added thousands of new locations to its U.S. point of interest sets, for a total of more than 22 million places in the U.S. and 62 million places around the world.
But with all these places, how might developers filter through these to make sure the results are relevant for their apps? Well, Factual also announced a new feature: Place Rank, which will enable developers to sort query results by relevance.
The documentation for Place Rank explains:
Factual Place Rank is a relative approximation of the place’s significance according to its electronic footprint; it is supported as a sort option by default. The rank itself is not available as a value in the API or download.
In other words it will be a bit like Google search results — the results are ordered by ranking, but you can’t see what those rankings are or how they’re determined.
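To make the idea concrete, here is a minimal sketch of sorting by a rank that is never exposed. The records, the `_footprint` field, and the `query_sorted_by_rank` function are all made up for illustration; nothing here reflects how Factual actually computes or stores Place Rank.

```python
from typing import Dict, List

# Made-up sample records. "_footprint" is a hypothetical private score that
# stands in for whatever signal a provider might use internally.
PLACES: List[Dict] = [
    {"name": "Corner Kiosk", "category": "Retail", "_footprint": 0.12},
    {"name": "Central Station", "category": "Transport", "_footprint": 0.97},
    {"name": "Neighborhood Cafe", "category": "Food", "_footprint": 0.55},
]


def query_sorted_by_rank(records: List[Dict]) -> List[Dict]:
    """Return records ordered by the private score, with the score removed."""
    ranked = sorted(records, key=lambda r: r["_footprint"], reverse=True)
    # Drop every underscore-prefixed field so callers only see the ordering.
    return [
        {key: value for key, value in record.items() if not key.startswith("_")}
        for record in ranked
    ]


if __name__ == "__main__":
    for place in query_sorted_by_rank(PLACES):
        print(place)
```

The caller gets results in rank order but has no way to read the score itself, which is the behavior the documentation describes.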
The new location data just added includes:
The company has also added more than 50 new categories for organizing the places, as well as support for treating local and national chains differently than other businesses. So far it can reliably recognize 100 chains (listed here), and there is experimental support for 800 more.
Rather than risk breaking old applications, the expanded U.S. data set is actually being released as a new, separate data set.
Factual was founded in 2007 by Gil Elbaz, the co-founder of Applied Semantics, which Google acquired for its AdSense technology. Factual raised a $25 million series A led by Andreessen Horowitz and Index Ventures in 2010.
Article courtesy of TechCrunch
Open data platform Factual.com is beefing up its Global Places offering today with three new APIs that will provide mobile developers with access to a ton of new data which can help them build better location-aware apps. But the company notes that the APIs’ launch will be of special interest to mobile ad providers, including mobile ad networks, demand-side platforms and agencies, who are looking for new data points around geography. This is particularly important on mobile where traditional methods of ad targeting – beacons and cookies – aren’t viable.
The Geopulse API (beta) is the first of three, and works to reveal directionally where users intend to go, rather than signaling their arrival at a destination. The API provides everything Factual knows about the location. You provide it latitude and longitude, and Factual returns additional attributes which it calls “pulses.” These pulses use Factual’s network of signals, calculated metrics, and census data, which come from Factual itself, publicly available data, or from third parties.
The first few “pulses” to become available include:
The more interesting ones here are the Factual Commercial Density and Factual Commercial Profile. The company has taken its Places data and provided overviews of the density and type of businesses in the area.
A potential use case for this API could involve a brand like Starbucks, which wants to know when, where and to whom it should serve its ads in order to get the highest yield and conversions. With the API, the company would know not only the exact location of the consumer (the latitude and longitude), but also all the contextual info about and around the location, too.
More “pulses” will arrive in the next few months. Factual says that pricing will depend on use case and usage volume.
The second API is the Reverse Geocoder API (beta), which converts longitude and latitude into an address (U.S. only) or region (49 other countries). There are a few of these out there already from Google, Yahoo and MapQuest, Factual notes. While the company says that it does not see itself getting into the mapping business, it does see a need to serve its own API to complement the other offerings in its Places product.
Finally, there is the World Geographies API (beta), which is primarily used to translate place names between languages, and determine what cities are found in what regions, what states in what countries, etc. This is another complementary service, as Factual has already published small businesses and landmarks. This adds 6 million more natural and administrative geographies.
Factual admits that there are a few other players that have similar products to those it announced today, but wants to differentiate itself with data (especially in terms of the Commercial Density data, above), speed and scale. The company claims to provide near real-time access to these datasets in well under 100ms.
Article courtesy of TechCrunch
Open data platform Factual.com is launching a new API for developers of location-based services called Resolve. The API is an entity resolution API that makes partial records complete, matches entities against one another and assists in the process of de-duping and normalizing datasets.
What this means is that developers can simply tell Factual what they know about an entity (i.e., a venue in a place database) and it will fill in the missing pieces (e.g., the category, the latitude/longitude info and venue’s address). At launch, Resolve will be available only for Factual’s list of U.S. Places, but the company hopes to expand Resolve globally in the future.
Resolve is one of those under-the-hood type launches that is going to make many engineers’ lives much easier. To use, a developer sends what they know about a place to Resolve as a GET request with the attributes included as JSON-encoded key/value pairs.
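As a rough sketch of what such a request might look like, the snippet below packs known attributes into a single JSON-encoded query parameter on a GET URL. The endpoint `https://api.example.com/places/resolve` and the parameter name `values` are placeholders chosen for illustration, not Factual’s documented interface; consult the Resolve documentation for the real names.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and parameter name, used only for illustration.
RESOLVE_URL = "https://api.example.com/places/resolve"
PARAM_NAME = "values"


def build_resolve_url(known_attributes: dict) -> str:
    """Return a GET URL carrying the known attributes as one JSON parameter."""
    payload = json.dumps(known_attributes, sort_keys=True)
    # urlencode percent-escapes the JSON so it is safe inside a query string.
    return RESOLVE_URL + "?" + urlencode({PARAM_NAME: payload})


def decode_resolve_url(url: str) -> dict:
    """Recover the attribute dictionary from a URL built above."""
    query = parse_qs(urlsplit(url).query)
    return json.loads(query[PARAM_NAME][0])


if __name__ == "__main__":
    url = build_resolve_url({"name": "Example Cafe", "postcode": "90001"})
    print(url)
    print(decode_resolve_url(url))
```

Encoding the attributes as one JSON value keeps the request a plain GET while still allowing arbitrary key/value pairs.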
The API then, well…resolves the request by looking at all the possible candidates in Factual’s dataset and returns a solid match (if one can be identified) and all of the missing attributes.
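The matching step can be sketched in a few lines, under heavy assumptions. The version below scores each candidate by average string similarity over the fields the caller supplied and returns the full record only when the best score clears a threshold. The sample records, the 0.85 threshold, and the use of Python’s `difflib` are all illustrative choices; the article does not say how Factual scores candidates.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional

# Made-up candidate records standing in for a places dataset.
CANDIDATES: List[Dict[str, str]] = [
    {"name": "Example Cafe", "address": "123 Main St", "postcode": "90001",
     "category": "Food"},
    {"name": "Other Diner", "address": "9 Side Rd", "postcode": "90002",
     "category": "Food"},
]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def resolve(partial: Dict[str, str], candidates: List[Dict[str, str]],
            threshold: float = 0.85) -> Optional[Dict[str, str]]:
    """Return the best-matching full record, or None if nothing is close."""
    best, best_score = None, 0.0
    for candidate in candidates:
        shared = [key for key in partial if key in candidate]
        if not shared:
            continue
        # Average the per-field similarity over the fields the caller knows.
        score = sum(similarity(partial[k], candidate[k]) for k in shared) / len(shared)
        if score > best_score:
            best, best_score = candidate, score
    if best is None or best_score < threshold:
        return None
    return dict(best)


if __name__ == "__main__":
    print(resolve({"name": "example cafe", "postcode": "90001"}, CANDIDATES))
```

Returning nothing when no candidate clears the threshold mirrors the article’s caveat that a match is returned only “if one can be identified.”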
A couple of companies are already using Resolve, including daily deal API provider Sqoot and restaurant menu platform OpenMenu.com. Sqoot uses Resolve to convert the business name and address within a daily deal to a geo-referenced entity, and provides users with the most hyperlocal and geo-relevant deal recommendations possible. Meanwhile, OpenMenu uses Resolve to identify restaurants in its own database that match those in Factual and then pushes this info on to Factual’s Crosswalk API.
Crosswalk, another Factual Places API, tells you the URL and ID of a place in up to 40 other third-party namespaces including Foursquare, Urbanspoon, Citysearch, Yellowpages, Yahoo, AllMenus, Yelp, Zagat, Chow, Gowalla, InsiderPages, MenuPages, Menupix, SimpleGeo, Superpages, Explore To, Fwix and others.
Documentation on how to use the newly launched Factual Resolve API, including examples and requirements is available here.
Factual is an open data platform for application developers that leverages large scale aggregation and community exchange. For example, you will find datasets for millions of U.S. and International local businesses and points of interest, as well as datasets on entertainment, education, and health. Developers can access the data through an API, use our mobile SDK, or download the data directly.
Factual was founded in 2007 by Gil Elbaz, co-founder of Applied Semantics (which launched ASI’s AdSense product). Applied…
Article courtesy of TechCrunch
In the quest for a unified database of places, geo-location startup Factual is making big strides. Today it is announcing a major partnership with SimpleGeo, under which Factual will maintain and power SimpleGeo’s places database, which up until now has been a competing database of places in the eyes of developers.
The merged database will have 30 million places, and be maintained and updated by Factual. Developers will be able to access the database either through SimpleGeo or Factual. “It’s Factual’s dataset, our interface,” says SimpleGeo CEO Jay Adelson.
SimpleGeo instead will focus on its other geo-infrastructure services for developers, such as its geo-storage and geo-context services (which deliver relevant information about a particular place from weather to voting districts to neighborhood boundaries).
Other startups such as Fwix are also gravitating towards Factual as the repository of places data. But don’t expect a unified places database anytime soon. Bigger players like Google, Facebook, and Foursquare will continue to build out their own places databases.
But Factual, which is backed by more than $25 million from Andreessen Horowitz, Index, GRP, SV Angel and others, has the resources to go up against the bigger players.
Adelson tells me why SimpleGeo decided to go with Factual. One reason is Factual’s machine learning capabilities which it applies to keeping the geo-data fresh and accurate, along with “the ability to merge data as it arrives.” But the other reason is “there is a degree of neutrality.” While he didn’t name names, it’s not too hard to guess who he thinks is not neutral. “There are other data sets that are proprietary, or geared towards a single use,” he says. These other partners insist that “you have to use our maps, our check-ins that drives my business. Factual is in a neutral position, accepting different data sources. We looked at these different data partners and how many data sources they can integrate. I think it is a longterm bet when I am betting on Factual.”
Article courtesy of TechCrunch
With a fresh infusion of $4 million from Comcast Interactive Capital, which it raised recently in a series B financing (with previous investor BlueRun also pitching in), hyperlocal places database Fwix is pushing out a major upgrade to its developer API today. Fwix is creating an open database of places in partnership with Factual. Developers will be able to pull data about millions of places into their own apps, and edit the places as well by adding their own data or content.
Fwix started focusing on becoming more of a places database last year, but its API was clunky. Nevertheless, it served 20 million API calls in March, up from 11 million in February. Now with the new developer API, it will be easier to associate places with content such as news articles, Tweets, and check-ins. The API also includes an advertising layer which plugs into various mobile ad networks and geo-specific offers, including ones from Groupon, LivingSocial, YellowPages.com, and Gilt City.
Other efforts to create an open places database exist. What is unique about Fwix’s places database is that it can append location-specific content to a place such as news articles, Tweets, and photos. The Fwix database is set up to make it easy for developers to mix and match places with content and bring both into their apps. They can filter places by popularity (measured by check-ins) or other ways, and write back into the database with data generated by their users.
Article courtesy of TechCrunch
Spend enough time in Silicon Valley, and among all the buzzwords you’ll hear, neatly tucked in with “graph,” “serendipity,” and “personalization,” is one often uttered though, on the whole, not yet fully understood: “Big Data.” On the surface, everyone realizes the opportunity. Data is being generated at lightning speed, the cost of storing it is tiny, and new technologies are available to help manage, organize, and secure the data. Earlier this month, LinkedIn co-founder and Greylock partner Reid Hoffman delivered a presentation on this topic at SxSW, and starting next week, GigaOM’s annual big data conference “Structure” kicks off in NYC.
At the consumer level, while we are wowed by pretty visualizations, the real advancements in big data technologies cover (1) how data is structured and stored, (2) how it is organized and retrieved, and, most interesting to me, (3) how underlying mathematics can be written into algorithms to leverage the data and help discover entirely new things. I’ll paraphrase from one data scientist, LinkedIn’s Peter Skomoroch, who notes on Quora that cheap data storage allows users to leverage asymmetric information, larger data sets increase the likelihood that new insights can be found, and machine learning advancements can be used in entirely new, game-changing ways.
This being Silicon Valley, the obvious targets in sight concern the massive bits of data generated online through social networks, e-commerce, mobile location, and advertising technology. There are no surprises here, and some of the best data scientists happen to reside within these social networks, such as Dmitry Ryaboy from Twitter, Jeff Hammerbacher from Cloudera (formerly of Facebook), Deepak Singh from Amazon, and Skomoroch and DJ Patil from LinkedIn. Not only is the amount of data generated within social networks staggering, but the pace at which it’s generated and its complexity are both accelerating. Beyond the data visualizations captured by social network maps, the opportunities that lie hidden within those relationships are phenomenal and will feed into social commerce, context-awareness, and location-based ads.
These are the current “hot spots” for big data. There are many companies working on some angle within “big data,” and some which have a long history. Earlier this week, Aster Data Systems was acquired by Teradata, and there are plenty of firms focused on some aspect of data. Dataguise focuses on “masking” sensitive data that is either regulated by law or corporate policy, protecting information from external and internal breaches. Lattice Engines uses algorithms to provide its clients with predictive analytics and learning. Cloudera develops and distributes Hadoop, which powers data processing for websites. And companies like Factual and InfoChimps provide platforms where anyone can share and manipulate data on any subject. (While there are many companies focused on big data, I’ll highlight a few and ask the crowd to help input more into the system, here, and follow up on Quora.)
One of the big data companies to break out into the mainstream tech press is located in downtown Palo Alto: Palantir Technologies. As TechCrunch’s Leena Rao pointed out in June 2010, after the company raised Series D funding, big data companies, and especially Palantir, don’t capture much social media attention. They are instead busy selling their flagship products, Palantir Government and Palantir Finance, to government and financial institutions worldwide. Big data investors know the writing is on the wall: Palantir’s Chairman, Peter Thiel, has been on the record about big data and believes the company will not only cross the billion-dollar threshold, but shoot past it. Will it help securities regulators find the next crisis or Bernie Madoff? Will it help governments monitor potential terrorist activity and provide actionable information before it’s too late? These are big problems that affect our society and for which we don’t have the best solutions. We needed solutions yesterday, and when Palantir and other companies help us identify and head off these threats, they will be rewarded a billion times over.
Now, let’s take big data one step further. Whether we’re all data scientists or not, we understand the scale of the opportunity. We know there’s smart money to invest in data storage, masking, security, retrieval, analysis, and visualizations. But, what about leveraging data for true discovery? Can new techniques in mathematics and physics help computer scientists create a new breed of programs to analyze datasets that traditional approaches cannot? How could our world change if we better understood the underlying mathematics behind the data? If finding insights within data is like finding a needle in a haystack, will the right math-based approaches help us build better magnets to draw out those needles? The conventional wisdom to date has been to apply these new techniques to the online world, where data is generated and stored in robust and zero-cost ways, but there is much, much more to explore.
While these are certainly big problems to tackle and will generate valuable insights for web properties to exploit, I’m most intrigued by the mathematicians and physicists who are innovating within their disciplines and applying them to tackle big problems around big data, particularly concerning the speed and shape of data. There are two aspects of data that capture my interest as a consumer. First, what are the speed and motion characteristics around the data generated, especially for networks that move in realtime, such as social networks and financial markets? Second, what is the shape of the data, and what can we learn from analyzing new dimensions within the data that perhaps weren’t accessible even just a year ago?
It’s within those fast-moving data and subsequent nooks and crannies that our next big discoveries may be hidden, waiting for new equations to unearth them. There are many public datasets (such as data.gov) available to scientists, some of which are listed here and here. There’s no shortage of opportunities to mine these resources, such as old public health studies, and to find new trends to inform the future. Perhaps just as interesting, if not more, is old data collected by large private companies and/or governments that is either too sensitive or too competitive to release into the wild. Today, big pharmaceutical and biotechnology companies are sitting on mountains of internal data related to trials they’ve run, energy firms have data related to mineral and resource deposits, and finance speculators use the most sophisticated programs to run hedge funds and the like, looking for the smallest holes to exploit and extract gains.
Let’s assume this data was released, or at least made available to the best mathematicians out there today—could they help us sift through life science data and harvest information that could itself lead to the formation of entirely new products and services? Could they help us find new deposits of minerals, oil, or gas buried deep in the ground or remote parts of the ocean bed? Could the data help us target geoengineering tactics high up in the clouds to combat global warming? Could the data be used in financial markets, not only to notify us of fraudulent behavior, but also to prevent market movers from profiting during bubbles while the masses get doused after the bubbles pop? And, could we analyze seismic activity to predict earthquake likelihoods and tsunami arrival times? The folks and institutions that currently sit on this data have reasonable, short-term incentives to protect it given how competitive their industries are. Yet in the long-term, we’ll need to access these and other data, and hopefully allow entrepreneurs to probe them with all these new tools so, as Hammerbacher says, we can “use the past to impact the future.”
Yes, there is still much more value to extract from social commerce and interpersonal networks—but while these are worthwhile pursuits, the real game-changing innovation and advancement in big data will only come when we’re able to apply the most cutting-edge mathematics and physical sciences to the biggest problems we collectively face.
Image by Paul Butler.
Article courtesy of TechCrunch
One of my big predictions for 2011 is that we are going to start to see open databases for places spring up and take hold. Hyperpublic, which just launched today, is doing just that by creating an open database of people, places, and things tied to specific locations. “We are trying to structure the data in your local world,” says CEO and founder Jordan Cooper, who is also a partner at Lerer Ventures. In the video above, he gives me a quick demo of the service.
Hyperpublic has raised a $1.2 million seed round from Lerer Ventures, Ron Conway’s SV Angel, RRE Ventures, NextView Ventures, Hudson River Ventures, Thrive Capital, and Softbank.
You can upload any “object” to Hyperpublic as long as it has geo-coordinates. An object is simply a picture with tags and a location. These could be a restaurant on Second Avenue in New York City, used ski boots for sale on the Upper East Side, or a startup founder (with the location being where he works). The site was seeded with about 2,000 objects in New York City and San Francisco, and now anyone can populate it. Objects can be added via the website or simply by snapping a picture on your cell phone and emailing it to firstname.lastname@example.org once you have an account set up.
The idea is that you can set your location and then search for things around you. You can filter by places, people, or things, set the radius of your search, or search by tags. If you recognize someone you can send them a message, or clip any object to save it in your own collection.
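A search like this can be approximated with a great-circle distance filter. The sketch below is one minimal way to do it, assuming each object carries a latitude, longitude, type, and tag set. The `GeoObject` class, the sample coordinates, and the `search_nearby` function are invented for illustration and say nothing about how Hyperpublic implements its search.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class GeoObject:
    """Hypothetical stand-in for a tagged, geo-located object."""
    name: str
    kind: str  # "place", "person", or "thing"
    lat: float
    lon: float
    tags: FrozenSet[str] = frozenset()


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def search_nearby(objects: List[GeoObject], lat: float, lon: float,
                  radius_km: float, kind: Optional[str] = None,
                  tag: Optional[str] = None) -> List[GeoObject]:
    """Return objects within radius_km, nearest first, optionally filtered."""
    hits = []
    for obj in objects:
        if kind is not None and obj.kind != kind:
            continue
        if tag is not None and tag not in obj.tags:
            continue
        distance = haversine_km(lat, lon, obj.lat, obj.lon)
        if distance <= radius_km:
            hits.append((distance, obj))
    hits.sort(key=lambda pair: pair[0])
    return [obj for _, obj in hits]
```

A linear scan like this is fine for a few thousand objects; a larger database would likely want a spatial index, which is beyond this sketch.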
The point of Hyperpublic is to “capture local experience, organize it, and display it,” says Jordan. He thinks of it as capturing the last mile of data on local objects—not just places, but things inside places. Stores and merchants could use it to show their products for sale. If you are looking for men’s suits within a 10-block radius, and enough stores or shoppers had uploaded and tagged suits in nearby stores, you could search for suits in the vicinity and see what was available as one example.
“People are going to start using the camera on this phone for utility instead of aesthetic value,” predicts Cooper. “More and more it will be used to capture information, and support their memory.” The success of Evernote supports that theory, but Hyperpublic wants to build a foundation for hundreds more apps like that.
The site today is step 1. Once the database begins to fill up, Jordan wants to open it up to anyone who wants to use it to pull data into other apps. Once the database is exposed via an API, Hyperpublic will join other location databases such as those from SimpleGeo and Factual as a resource for developers.
Article courtesy of TechCrunch
Factual, the open database company, closed a $25 million series A financing, led by Andreessen Horowitz and Index Ventures. VCs Ben Horowitz of AH and Danny Rimer of Index will be joining Factual’s board. Ron Conway’s SV Angel and former Hollywood agent Michael Ovitz also invested, as did some of the previous angels who put in about $2 million earlier this year (only half of which was previously disclosed). “The company has very significant aspirations,” says Rimer, “what they are seeking to do is extremely ambitious. We believe they will need a lot of funding.”
Factual started out as sort of a wiki for databases. Anyone can create or add data to Factual, and it has all sorts of APIs to make it easy for Websites and developers to build apps on top of the data. It also lets developers and consumers visualize this big data in all sorts of ways. But over the past few months, the company started to focus on a few key areas, especially local. It is also building datasets around healthcare, education, entertainment, and government.
But its big push right now is in local. It has a places database filled with the names, locations, addresses, phone numbers, and other info on 14 million businesses in the U.S. Geo apps like Booyah rely on this places database for their services. Factual’s biggest potential customer, however, is Facebook. Right now, Facebook Places in Japan and the UK is based at least partially on Factual data. If Factual can grow with Facebook Places, it has a chance to win the bigger business in the U.S. and elsewhere. It probably doesn’t hurt that Marc Andreessen sits on Facebook’s board.
Factual was founded by Gil Elbaz, who earlier co-founded Applied Semantics (that company was bought by Google for $100 million and became AdSense). He wants to build a big data company that creates and maintains valuable datasets that other Websites and developers can then build their apps on top of. Access to the data is very cheap or free at low volumes, but once an app starts to take off, Factual starts to charge data licensing fees based on how much data is used.
Article courtesy of TechCrunch