Definition:
In theoretical computer science, the CAP Theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
- Consistency (C)
- Availability (A)
- Partition Tolerance (P)
According to the theorem, a distributed system can satisfy any two of these guarantees at the same time, but not all three.
(Reference: Wikipedia)
Relevance and Importance:
It has been over twelve years since Eric Brewer, then a scientist at the University of California, Berkeley, made the conjecture that led to what we now know as the CAP Theorem. Over these years, the theorem has changed the rules and proved to be one of the significant seeds in determining how a highly scalable, distributed computing platform can be built. It has become essential reading for anyone involved in building a distributed system.
The importance of the CAP theorem becomes apparent when applications scale. For example, at low volume, delays in transaction completion to ensure consistency are acceptable, but as transaction volume increases, the latency cost of ensuring consistency can have a significant impact on the availability of the services and on the business as a whole.
Understanding and Implication:
To understand what this is all about, let us look at the three attributes the theorem deals with in the world of distributed systems.
Consistency: This implies that all users of the system, wherever they connect from and whichever node of the distributed system they connect to, get the same result for a particular query. It further implies that whenever one user updates a value on one node, all other users, spread across the distributed nodes, see the updated value.
The easiest way to understand this concept is through a real-life analogy. Say you use an ATM to withdraw money from your bank account. As soon as you complete the withdrawal, your account is updated, and the updated information is available to anyone who checks your account status, irrespective of when they do it and from which part of the world.
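To make the consistency guarantee concrete, here is a minimal Python sketch of the ATM analogy. It is a toy in-memory model, not how any real bank works: the class name ReplicatedAccount, the node names and the balances are all assumptions made up for illustration. A withdrawal updates every copy before it returns, so a read through any node gives the same answer.

```python
class ReplicatedAccount:
    """Toy account replicated on several in-memory nodes (hypothetical model)."""

    def __init__(self, node_names, opening_balance):
        # Every node holds its own copy of the balance.
        self.balances = {name: opening_balance for name in node_names}

    def withdraw(self, via_node, amount):
        # Validate against the copy on the node that received the request.
        if amount > self.balances[via_node]:
            raise ValueError("insufficient funds")
        # Synchronous replication: update every copy before returning.
        for name in self.balances:
            self.balances[name] -= amount

    def read(self, via_node):
        # A read is answered from the local copy of the chosen node.
        return self.balances[via_node]


account = ReplicatedAccount(["bangalore", "mumbai", "delhi"], opening_balance=1000)
account.withdraw("bangalore", 300)

# Whichever node answers the query, the balance is the same.
balances_seen = [account.read(node) for node in ("bangalore", "mumbai", "delhi")]
print(balances_seen)
```

The price of this behaviour is that a withdrawal cannot complete until every copy has been updated, which is exactly the latency cost mentioned earlier.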
Availability: This implies that the system continues to operate even when some parts of the distributed system are down, whether due to planned maintenance or because of an error. Services to the users are not impacted even when some parts of the system are down. To stress the point: availability is not just protection against hardware failures; it also covers software and means that factors such as load balancing and load distribution are accounted for.
Extending the previous analogy, you would expect to always be able to withdraw money from the ATM, without needing to care whether all the backend systems connected to the ATM are up and working at full capacity. You would regard the ATM service as highly available if and only if it dispenses cash whenever you want it.
Partition Tolerance: Distributed systems are invariably connected over a network. One way to improve efficiency and utilization is to distribute the load across the nodes of the network. Partition tolerance means that the system continues to operate well even when the network fails and cuts off a few of these nodes (or a cluster of them) from the rest, so that the two sides cannot communicate.
Extending the ATM analogy further, let us model a case. Of the many models available for distributed system load management, one that is widely used is sharding. At a very broad level, the model proposes grouping a set of entities together. In this example, let us say all ATM users from Bangalore are located on a particular node. When everything is stable, users identified as originally belonging to Bangalore are routed to the Bangalore node, irrespective of the point of access. A partition occurs when the link to the Bangalore node is broken. If the system is not partition tolerant, a set of users (the Bangalore-based users) cannot use ATM services.
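The sharding scenario above can be sketched in a few lines of Python. Everything here is hypothetical: the node names, the SHARD_FOR_CITY table and the reachable flags stand in for a real routing layer and real network health checks. The sketch only shows how a broken link to one shard affects exactly the users assigned to it.

```python
# Hypothetical shard map: each user's home city decides which node owns the account.
SHARD_FOR_CITY = {"Bangalore": "node-blr", "Mumbai": "node-bom"}

# Hypothetical link status; True means the node can be reached over the network.
reachable = {"node-blr": True, "node-bom": True}


def route_withdrawal(home_city):
    """Route a withdrawal to the shard that owns the user's account."""
    node = SHARD_FOR_CITY[home_city]
    if not reachable[node]:
        # No fallback exists in this sketch, so users of this shard are locked out.
        return "service unavailable: cannot reach " + node
    return "handled by " + node


before = route_withdrawal("Bangalore")   # link is up
reachable["node-blr"] = False            # the link to the Bangalore node breaks
after = route_withdrawal("Bangalore")    # Bangalore users are now affected
other = route_withdrawal("Mumbai")       # users of other shards are not

print(before)
print(after)
print(other)
```

A partition-tolerant design would need some answer for the failing branch, such as a replica of the Bangalore shard on another node, which brings the consistency question back in.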
What the CAP Theorem postulates is that, while building a distributed system, two of the above three attributes can be guaranteed, but it is not possible to guarantee all three to the same level of reliability. The implication is that one of the three attributes has to be dropped when building a highly scalable distributed system.
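The trade-off can be illustrated with a deliberately simplified Python sketch of a store with two replicas. The class TwoNodeStore and its "CP" and "AP" modes are assumptions invented for this example, not the API of any real database. During a partition, the "CP" store refuses writes (it stays consistent but is not available), while the "AP" store accepts the write on one side only (it stays available but the replicas diverge).

```python
class TwoNodeStore:
    """Toy two-replica store that must pick a side when partitioned (hypothetical)."""

    def __init__(self, mode, initial=0):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = [initial, initial]  # one copy per replica
        self.partitioned = False          # True when the replicas cannot talk

    def write(self, node_index, value):
        if not self.partitioned:
            # Healthy network: replicate to both copies.
            self.values = [value, value]
            return True
        if self.mode == "CP":
            # Prefer consistency: reject the write, giving up availability.
            return False
        # Prefer availability: accept locally, giving up consistency.
        self.values[node_index] = value
        return True

    def read(self, node_index):
        return self.values[node_index]


cp = TwoNodeStore("CP", initial=1000)
cp.partitioned = True
cp_accepted = cp.write(0, 700)
cp_values = (cp.read(0), cp.read(1))

ap = TwoNodeStore("AP", initial=1000)
ap.partitioned = True
ap_accepted = ap.write(0, 700)
ap_values = (ap.read(0), ap.read(1))

print(cp_accepted, cp_values)
print(ap_accepted, ap_values)
```

Real systems are more nuanced than this sketch: a consistency-first design typically keeps serving requests on the side of the partition that holds a majority of replicas, and an availability-first design needs a way to reconcile the diverged copies once the partition heals.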
PS:
Update: Brewer has published a new note clarifying the "2 out of 3" rule that seems to be an inherent inference from the CAP Theorem, and describing strategies for handling partitions while preserving consistency and availability. The note is the InfoQ article listed under References.
References:
http://en.wikipedia.org/wiki/CAP_theorem
http://www.royans.net/arch/brewers-cap-theorem-on-distributed-systems/
http://www.julianbrowne.com/article/viewer/brewers-cap-theorem/
ftp://ftp.compaq.com/pub/products/storageworks/whitepapers/5983-2544EN.pdf
http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed