As AWS technology evangelists, we often meet startups working on cool stuff. Every so often we discover startups that have done incredible things on AWS. Recently, I came across Navitas, a Berkeley-based company with development teams in Silicon Valley, Ecuador, and Thailand. Since I am deeply interested in location-based services and geo apps on AWS, I dived a little deeper to learn more about the company and its architecture.
Navitas is the creator of TAPTIN, a location-based service similar to Foursquare and Gowalla. However, TAPTIN goes beyond mere check-ins. The TAPTIN platform enables the creation of locally branded apps, such as Berkeley Local, which has events and recommendations for UC Berkeley and the city of Berkeley. TAPTIN is thus a new form of local media, with built-in Foursquare-style check-in features as well as services that let merchants engage with their customers through coupons, loyalty campaigns, and the like. The same platform can be used to build locally branded apps for every city around the world. Another example of an app built on the platform is the “We Love Beer” app. This beer app has a beer catalog, and pubs can link to the catalog categories to create their own beer menus. The app lets you find which beers are available nearby, locate a particular kind of beer, and find your friends at local pubs.
Recently, Navitas abandoned its server farm and moved its entire development and production environments to AWS. The company now runs 100% in the AWS cloud, and TAPTIN is scaling on AWS across multiple tiers of servers. The founder of the company, Kevin Leong, was helpful in explaining the architecture, which is described in detail below.
What Kevin and his team have done is commendable, especially given that they did it by bootstrapping, which Kevin says would not have been possible without AWS.
The figure below depicts the Navitas production environment, which consists of seven scalable layers, all using open source enterprise technologies. Load balancers are employed in multiple tiers. Search is based on Solr, a popular open source search platform from the Apache Lucene project. Solr is also used for geospatial search. The search tier uses HAProxy on an Amazon EC2 instance to apply write updates to a Solr master server, and these updates are then distributed to the two Solr read-only slaves.
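The routing rule in the search tier can be illustrated with a short Python sketch. This is not Navitas's HAProxy configuration, only a model of the same idea: update requests go to the single master, and queries are spread across the read-only slaves. The host names and the round-robin policy are assumptions made for illustration; the only detail taken from Solr itself is that updates are sent to an "/update" handler.

```python
import itertools


class SearchRouter:
    """Send writes to one master and spread reads across read-only replicas."""

    def __init__(self, master, replicas):
        if not replicas:
            raise ValueError("at least one read replica is required")
        self.master = master
        # Round-robin is an assumed balancing policy, chosen for simplicity.
        self._replicas = itertools.cycle(replicas)

    def route(self, path):
        """Return the backend that should handle the request at `path`."""
        # Solr update requests are addressed to an "/update" handler.
        if "/update" in path:
            return self.master
        return next(self._replicas)


# Hypothetical host names, for illustration only.
router = SearchRouter("solr-master", ["solr-slave-1", "solr-slave-2"])
write_target = router.route("/solr/update")
read_targets = [router.route("/solr/select?q=beer") for _ in range(3)]
```

Keeping all writes on one node means the slaves never diverge from each other; they only need to pull the master's index.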
The application tiers consist of three layers. Web pages are implemented in PHP and consume REST APIs running on Jetty servers. Some PHP pages also call Solr directly. The company originally started with Enterprise Java Beans (EJB) running on JBoss servers but then decided to use the more lightweight Java Persistence API (JPA) with Hibernate and Spring Framework HTTP Remoting. The caching layer runs memcached, which provides dynamically load-balanced cache services. They employ two layers of cache. The first is memcached, deployed in the web tier. If an object is not found in memcached, it is requested from the persistence tier, where the most recently used objects are probably already in cache. This technique yields higher performance. Memcached is configured to scale automatically as new servers are added.
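The two-layer lookup described above can be sketched as follows. This is a minimal illustration, not the team's code: plain dictionaries stand in for memcached and for the persistence tier's cache, and the `load_from_store` callback is a hypothetical placeholder for the database read.

```python
class TwoLevelCache:
    """Look in the web-tier cache first, then the persistence-tier cache,
    and only then fall through to the backing store."""

    def __init__(self, load_from_store):
        self.web_cache = {}    # stand-in for memcached in the web tier
        self.tier_cache = {}   # stand-in for the persistence tier's cache
        self._load = load_from_store
        self.store_reads = 0   # counts how often the store was actually hit

    def get(self, key):
        # First layer: the web-tier cache.
        if key in self.web_cache:
            return self.web_cache[key]
        # Second layer: the persistence tier's own cache.
        if key in self.tier_cache:
            value = self.tier_cache[key]
        else:
            value = self._load(key)
            self.store_reads += 1
            self.tier_cache[key] = value
        # Populate the first layer so the next lookup is served there.
        self.web_cache[key] = value
        return value


# Hypothetical data, for illustration only.
database = {"venue:1": "Local Pub"}
cache = TwoLevelCache(database.__getitem__)
first = cache.get("venue:1")    # misses both layers, reads the store
second = cache.get("venue:1")   # served from the web-tier cache
```

Even if the web-tier cache is flushed, the second layer still shields the database from the repeated read.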
While load balancing and automatic instance deployment ensure high availability as the TAPTIN and Grand Poker apps scale, Kevin’s team also implemented a failover strategy, automatic data backup, and data recovery steps, including recovery of the Solr search indexes. Because all of this traffic stays within AWS, it consumes no external bandwidth.
Navitas uses PostgreSQL on Amazon EC2 to store structured data, with pgpool for load balancing, failover, and asynchronous replication. It is very easy for them to add another instance to pgpool as a replica to support load balancing and parallel queries.
Media, such as photos, are transcoded and stored in Amazon S3.
Sandbox and Development Environment
Having the sandbox and the source code repository (SVN) on AWS was not only cost-effective but also a huge productivity gain for the team, as it was easy to launch another instance. With Amazon Machine Images (AMIs), developers create and launch a consistent environment for development, test, and production. Kevin said that his developer team, which is spread out around the world (in California, Latin America, and Asia Pacific), can launch the same pre-configured sandboxes in their own AWS region within minutes. This saved a lot of time and increased developer productivity. The company uses Spot Instances, which are cheaper, for all development work whenever they are available.
They also create a new sandbox environment on AWS for testing. With SVN on Amazon EC2, Navitas does its nightly build in the cloud. Source code is checked out to a build directory, where it is compiled, built, and deployed. Unit test harnesses are also run to ensure that no code is broken and that functionality is maintained. Kevin said that automated performance testing will come later, when the company has more resources.
With an automated build on AWS, they were able to migrate their extreme programming development methodology to the cloud: developers commit their code daily for the nightly build. They commit all development code to the SVN trunk for all projects and can build as required for testing in their sandbox environment. They create SVN branches for all production releases, which allows them to fix bugs quickly and efficiently and then either release new application binaries immediately or plan and stage back-end upgrades.
The company maintains a series of build scripts and uses Maven to manage the build dependencies. The server configuration is externalized so that the build scripts can pick up the appropriate configuration for sandbox and production. Their sandbox is a mirror of production, and the bootstrapped AMI does all the magic. A cloud-powered SDLC clearly has a lot of advantages.
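As a rough illustration of externalized configuration, the sketch below keeps one settings file per environment outside the code and selects it by name at build or deploy time. The JSON format, the file naming scheme, and the `load_config` function are all assumptions for the example; the post does not say how Navitas stores its configuration.

```python
import json
from pathlib import Path

# The two environments named in the post.
ENVIRONMENTS = ("sandbox", "production")


def load_config(environment, config_dir):
    """Load the settings file for one environment from `config_dir`.

    Assumes a hypothetical layout of `<config_dir>/<environment>.json`.
    """
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    path = Path(config_dir) / f"{environment}.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

A build script would call `load_config("sandbox", some_directory)` or `load_config("production", some_directory)`, so the same build artifact can be pointed at either environment without code changes.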
What I really liked about Kevin's strategy was the "Think Scale" approach. Not many startups invest in designing a scalable architecture early on, mainly because it's time-consuming and distracting. Some think it's too expensive. His message to startups was to "think scale" from the beginning, as it is really not too expensive to do in the cloud. To quote him, "I did it by bootstrapping. They can also. Amazon AWS is the way to go."
- Jinesh Varia