Jonathan Gray, co-founder of Continuuity. Photo: Continuuity
Many developers dream of working for a high profile company like Google or Facebook. But not Jonathan Gray. The first time Facebook called and offered him a job, he turned them down.
“I told them no way, no how,” says Gray, who started his first company — a web hosting business that ran on the Linux open source operating system — when he was 13. “I’m really entrepreneurial. I didn’t want to be just a code monkey, a cog in a machine.”
But Facebook kept calling. Gray was a core developer on Hbase, an open source clone of Google’s famed BigTable data storage system, a means of storing data across hundreds or even thousands of machines. Facebook was migrating more of its applications — such as its messaging system — to Hbase, and it needed someone with Gray’s expertise to help. “Eventually, they made me an offer I couldn’t refuse,” he says.
As it happened, Gray ended up building a system for Facebook that meant it didn’t need him as much as it once did. He left the company, and now, he’s offering much the same system to the rest of the world, helping anyone juggle “big data” the way Facebook does.
Back in 2011, if a company wanted to see how many people were “liking” its Facebook pages and other Facebook content, it had to make do with data that was as much as four days old, and that made it hard to tweak its campaigns in real time. But Gray helped solve this problem. As part of an effort to provide users with real-time stats, he built a system called Puma. This helped funnel data between Facebook’s different storage systems, and more importantly, it provided a platform that developers could use to build data-driven applications without becoming experts in Hadoop and Hbase.
Now, Gray’s company, Continuuity, wants to give everyone a platform on par with Puma. Continuuity’s first product, Reactor — which comes out of beta test today — is designed to help developers build real-time applications that rely on data stored in Hadoop and Hbase, without having to learn the finer points of distributed computing or cluster administration.
“Developers shouldn’t have to learn all this stuff,” he says. “They should just be able to build their app.”
The idea behind Continuuity actually dates to the days before Gray joined Facebook. As a senior at Carnegie Mellon University, he co-founded a startup called Streamy. The company provided a service that was sort of like Google Reader, but with a recommendation engine driven by your social network. That meant working with large volumes of data. He discovered Hadoop and Hbase while running Streamy. “I was interviewing this someone and he said: ‘Hey, have you seen this thing Hadoop?’” Gray says. “We didn’t end up hiring him, but I’m really glad we interviewed him.”
Streamy wound up using Hadoop and Hbase extensively, and Gray became a core contributor to Hbase. After Streamy folded, he wanted to start a new company dedicated to the platform. But then Facebook came calling. “It worked out. The company I would have started wouldn’t have been the company I wanted to work in,” he says. “Hbase was still too immature. It would have been a pure open source company hammering Hbase into something that would work for enterprises.”
Instead, Continuuity focuses on making Hbase easier to use. One digital advertising company, Lotame, has already undergone a Facebook-style transformation. It once offered its customers analytics data that was hours old, and now, that data is seconds old. More importantly, Lotame’s developers who aren’t experts in Hadoop can now build data-driven applications and services.
And, just for fun, Gray and Continuuity co-founder Nitin Motgi and Streamy co-founder Don Mosites re-built Streamy using Reactor. Gray says it took them only two weeks — as opposed to several months — to recreate most of the core tools.
Gray likens Reactor to a platform-as-a-service — something along the lines of Heroku or Google App Engine, an online service for building and hosting software applications. But as of now, Reactor is only available for use inside your own data center — not as a public cloud service anyone can use. “Our long-term vision is to eventually go cloud,” he says. “But some customers will probably never go cloud, and some have gone cloud and then gone back on-prem because it’s too expensive.
“All the customers that we’ve seen, not just the Fortune 500, but startups as well, have brought Hadoop on-premise. The customers who are in the cloud generally don’t have scale or don’t have a true big data problem. But companies like Lotame once they reach scale they’re mostly on-premise.”