What is scalability in a nutshell?
Now that you are filling gas without making anybody wait, you have become famous and everybody wants to fill up at your station. With 120 cars arriving per hour and a single pump that takes a minute per car, a driver may have to wait at least a minute before filling starts. That waiting time is your latency, and it means you are no longer serving requests properly. To solve this you add one more pump, so that nobody has to wait and 120 cars can be filled per hour. In software terminology this is horizontal scaling.
Now that you can serve 120 cars, demand grows to 500 cars per hour. Assume you cannot afford more pumps, so the existing two pumps have to get better: each must fill a tank in roughly 15 seconds instead of 1 minute. How many requests you can serve per second is called throughput (TPS), and raising the capacity of the existing pumps is vertical scaling in software terminology.
So in literal terms, scalability means that a component you build should be able to expand to serve increasing load, and that expanding it should require minimal changes.
The pump in the example above can be any of your software components, such as a web server, storage or an API.
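The arithmetic behind the gas station example can be sketched in a few lines of Python. The function names are made up for illustration, and the one-minute fill time for the original pump is taken from the example above.

```python
import math

def pumps_needed(cars_per_hour: float, seconds_per_fill: float) -> int:
    """Smallest number of pumps whose combined throughput covers the arrival rate."""
    cars_per_pump_per_hour = 3600 / seconds_per_fill
    return math.ceil(cars_per_hour / cars_per_pump_per_hour)

def max_fill_seconds(cars_per_hour: float, pumps: int) -> float:
    """Longest fill time that still lets a fixed number of pumps keep up."""
    return 3600 * pumps / cars_per_hour

# Horizontal scaling: how many one-minute pumps do 120 cars per hour need?
print(pumps_needed(120, 60))      # 2

# Vertical scaling: how fast must two pumps be for 500 cars per hour?
print(max_fill_seconds(500, 2))   # 14.4
```

Two pumps at exactly 15 seconds per fill handle 480 cars per hour, so the 15 seconds in the example is a round figure; keeping up with 500 cars needs about 14.4 seconds per fill.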
What is an API?
An API is a software component that provides a means to interact with an application. If you have a data-rich application, you need an interface to expose its functionality, and that interface is what we call an API. For example, Google has a maps application that can give you route info if you provide coordinates: the coordinates are the input to the API and the route info is the output.
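To make the input/output idea concrete, here is a toy sketch of such an interface. It is not Google's API: get_route is a hypothetical function, and the "route info" is just the straight-line (haversine) distance between two coordinates, assuming an Earth radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def get_route(origin, destination):
    """Hypothetical API: two (latitude, longitude) pairs in, route info out."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, destination)
    # Haversine formula for the great-circle distance between two points.
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    distance_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
    return {
        "origin": origin,
        "destination": destination,
        "distance_km": round(distance_km, 1),
    }

print(get_route((0.0, 0.0), (0.0, 1.0)))
```

The caller only needs to know what goes in and what comes out; how the distance is computed stays hidden behind the interface.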
Designing scalable APIs
Check Philip’s blog for the basics of REST API design. There are three things to remember before designing your API. First, APIs should be self-explanatory, or you must have proper documentation (Swagger, RAML, etc.). Second, the API comes first and the implementation second, so put more effort into the design than into implementing it. Third, the API is the interface to your application, and its clients could be end users, your PC app or your mobile app.
Tooling is really important for keeping your APIs data-driven while you iterate on them based on feedback. Tools like Amazon API Gateway or the Apigee API portal will accelerate your API deployment while taking care of security, providing usage statistics, and handling pre/post processing for your APIs if required. For example, the same API might have to return less detail when it is invoked by a mobile app client, since a small screen can show less data.
While the choice of language depends on the application requirements, recent frameworks like Spring Boot in Java can really accelerate delivery and make it easy to package the code into containers such as Docker.
Microservices approach: I don’t want to discuss this at length, but it is now widely assumed that if you are not following this architecture, you are risking the scalability of your application in the near future and making every single release harder. Microservices have their own dedicated resources and are independent in terms of development and deployment. They can still depend on other microservices, but the fewer the dependencies, the better you have orchestrated them. In the cloud, server instances are often optimised for memory, computation and so on, so with microservices you can choose the instance type that matches each service’s requirements.
Hystrix is one of the best-known frameworks for making microservices latency tolerant and fault tolerant. When services depend on each other, the failure of one microservice can bring down the whole system. Hystrix wraps calls to dependencies with timeouts, circuit breakers and fallback mechanisms, so that a slow or failing dependency does not drag its callers down with it. This makes the services much more reliable.
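The circuit breaker idea can be sketched without any framework. This is a minimal illustration in Python, not the Hystrix API: the class, its parameters and the thresholds are all assumptions made for the example.

```python
import time

class CircuitBreaker:
    """Illustrative breaker: opens after max_failures consecutive errors."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so the logic is testable
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()         # circuit open: skip the dependency
            self.opened_at = None         # cool-down over: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0                 # a success closes the circuit again
        return result
```

A caller would wrap each remote call, for example breaker.call(fetch_prices, lambda: cached_prices), where fetch_prices and cached_prices are placeholders for your own code. While the circuit is open the failing service is not called at all, which gives it room to recover.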
ELK stack: Troubleshooting will be a nightmare if you don’t have proper monitoring in place. Whether you are on premises or in the cloud, the ELK stack (Elasticsearch, Logstash, Kibana) is a must for basic monitoring of APIs and applications.
Event Sourcing and CQRS (Command Query Responsibility Segregation): The idea behind CQRS is to use different models for reading and writing. You accept eventual consistency here, because you respond to the caller before the event has been fully processed. Once the event is published, all subscribers are eventually updated, and if something goes wrong the event can be published again. CQRS may or may not fit your architecture, depending on the application, so investigate carefully before introducing it or it could add complexity.
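A toy version of the pattern, with all class and function names invented for illustration, might look like the following. Events are delivered synchronously here to keep the sketch short; in a real system the subscribers would be updated asynchronously, which is where the eventual consistency comes from.

```python
from collections import defaultdict

class EventStore:
    """Append-only log; subscribers are notified of every published event."""

    def __init__(self):
        self.events = []
        self.subscribers = []

    def publish(self, event):
        self.events.append(event)
        for handler in self.subscribers:
            handler(event)

class BalanceView:
    """Read model: updated only by events, never written to directly."""

    def __init__(self):
        self.balances = defaultdict(int)
        self.seen = set()

    def apply(self, event):
        if event["id"] in self.seen:      # redelivered events are ignored
            return
        self.seen.add(event["id"])
        self.balances[event["account"]] += event["amount"]

def deposit(store, event_id, account, amount):
    """Command side: validates the request and records an event."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    store.publish({"id": event_id, "account": account, "amount": amount})

store = EventStore()
view = BalanceView()
store.subscribers.append(view.apply)

deposit(store, 1, "alice", 50)
deposit(store, 2, "alice", 25)
view.apply(store.events[0])       # delivering an event again does not double count
print(view.balances["alice"])     # 75
```

Making the read model idempotent is what makes "publish the event again" a safe recovery step.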
Scaling the applications and APIs in the cloud
No single points of failure: First things first, you cannot have only one of any component in the architecture. You must have at least two of each, irrespective of load.
Content Delivery Networks: CDNs are needed for most applications. They cache static data, whether HTML pages or API responses, which drastically reduces the load on the server. CloudFront (AWS) and Akamai are CDN examples, and NGINX can play a similar caching role as a reverse proxy in front of your servers.
Load Balancers: If you have multiple instances running, you need to split the load between them. If an instance keeps throwing errors, the load balancer can stop sending requests to it and split the load among the remaining instances. Examples: Elastic Load Balancer, F5.
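As a rough sketch of that behaviour, the following Python class rotates requests across instances and stops routing to any instance that reports too many consecutive errors. The class name, the error threshold and the instance names are assumptions for the example; real load balancers use active health checks and more elaborate policies.

```python
class RoundRobinBalancer:
    """Rotates over instances, skipping those with too many consecutive errors."""

    def __init__(self, instances, max_errors=3):
        self.instances = list(instances)
        self.max_errors = max_errors
        self.errors = {name: 0 for name in self.instances}
        self.position = 0

    def healthy(self):
        return [i for i in self.instances if self.errors[i] < self.max_errors]

    def next_instance(self):
        candidates = self.healthy()
        if not candidates:
            raise RuntimeError("no healthy instances")
        choice = candidates[self.position % len(candidates)]
        self.position += 1
        return choice

    def report(self, instance, ok):
        # A success resets the counter; a failure moves the instance
        # one step closer to being taken out of rotation.
        self.errors[instance] = 0 if ok else self.errors[instance] + 1

lb = RoundRobinBalancer(["app-1", "app-2", "app-3"], max_errors=2)
lb.report("app-2", ok=False)
lb.report("app-2", ok=False)
print(lb.healthy())   # ['app-1', 'app-3']
```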
Choosing Cloud Instances: In the cloud, instances are generally optimised for computation power, memory and so on, so choose an instance type that fits your requirements. Once you have chosen it, you can configure auto scaling based on the load or number of requests. On days like Thanksgiving your website might need additional capacity, and you can also handle this with an AWS Lambda function that adds and removes instances.
cf scale APP -i INSTANCES → (Cloud Foundry) this command changes the number of running instances of an app, so adding servers really is that simple.
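The decision an auto scaler makes can be sketched as a small function. The per-instance capacity and the instance limits below are made-up numbers, and the helper only builds the cf scale command shown above as an argument list; it does not run it.

```python
import math

def desired_instances(requests_per_second, capacity_per_instance,
                      min_instances=2, max_instances=10):
    """Instances needed for the current load, clamped to configured limits."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))

def cf_scale_command(app, instances):
    """Builds the argument list for `cf scale APP -i INSTANCES`."""
    return ["cf", "scale", app, "-i", str(instances)]

# A quiet day still keeps two instances, so there is no single point of failure.
print(desired_instances(50, 100))     # 2
# A holiday spike scales out, but never past the configured ceiling.
print(desired_instances(450, 100))    # 5
print(cf_scale_command("my-app", desired_instances(450, 100)))
```

The same function could run inside a scheduled job or a serverless function that polls your request metrics, with "my-app" replaced by your application name.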
Containers: Docker is already a well-known container platform, and if you haven’t heard of it you are missing out on a lot. Within minutes you can create and delete Docker containers on your Amazon EC2 instance or Google Cloud compute machine, and once Docker is running you can simply deploy your packaged code into a container. You no longer have to worry much about platform dependencies, because the container isolates your application from the host.
Kubernetes: If you have containers, then you need Kubernetes to give them self-healing, auto-scaling, service discovery, load balancing, rollouts, rollbacks, storage orchestration and so on.
Database Bottlenecks: Don’t let the database become a bottleneck after doing so much to make the API or application better. If your design suits a NoSQL database, it is better to move away from SQL databases. If you are dealing with documents, go with MongoDB; if you are dealing with key-value data, go with Redis, DynamoDB or similar.
If you have to work with a SQL database, note these points: use stored procedures, make sure indexing is done properly, and for bulk updates consider dropping the constraints beforehand and restoring them afterwards.
Read optimisation can be done using cache servers or horizontal scaling, while write optimisation can be done using database sharding or partitioning.
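Both ideas fit in a short sketch. Here plain dictionaries stand in for the database shards and the cache server, and the class name, shard count and hashing scheme are assumptions chosen for illustration: writes are spread across shards by hashing the key, and reads go through a cache first.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Maps a key to a shard using a stable hash, so the mapping survives restarts."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

class ShardedStore:
    """Dictionaries stand in for database shards and for a cache server."""

    def __init__(self, shard_count=4):
        self.shards = [dict() for _ in range(shard_count)]
        self.cache = {}
        self.shard_reads = 0              # counts reads that reached a shard

    def write(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value
        self.cache.pop(key, None)         # invalidate the stale cached copy

    def read(self, key):
        if key in self.cache:             # cache hit: no shard is touched
            return self.cache[key]
        self.shard_reads += 1
        value = self.shards[shard_for(key, len(self.shards))].get(key)
        self.cache[key] = value
        return value

db = ShardedStore()
db.write("user:42", {"name": "Ada"})
db.read("user:42")
db.read("user:42")
print(db.shard_reads)   # 1
```

A simple modulo mapping like this reshuffles most keys when the shard count changes, which is why production systems often use consistent hashing or a lookup table instead.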