Created by Lenio Vogt
about 6 years ago
| Question | Answer |
| --- | --- |
| What is a Distributed System? | A collection of autonomous computing elements that appears to its users as a single coherent system |
| Autonomous computing elements | - consists of all kinds of nodes - nodes can act independently |
| Single coherent system | - where data is stored should be of no concern to the user - infrastructure in the background is not visible to the user |
| Centralized/Decentralized/Distributed | |
| Middleware | - assists the development of distributed applications - separate layer of software that is logically placed on top of the respective operating system - enables data exchange between different operating systems |
| RPC | Remote Procedure Call - Communication service - Allows an application to invoke a function on a remote computer |
| Transactions | - applications make use of multiple services that are distributed among several computers - makes sure that every service is invoked, or none at all |
| Goals for Distributed Systems | - make resources easily accessible - hide the fact that resources are distributed across a network - be open (components can be easily integrated) - be scalable |
| Types of Distributed Systems | - computing systems - information systems - pervasive systems |
| Computing Systems | - Cluster computing - Grid computing - Cloud computing |
| Cluster Computing | - Group of connected computers - Act like a single entity --> High redundancy and distributed workload |
| Grid Computing | - Loosely coupled (decentralized) - Sharing tasks over multiple computers |
| Cloud computing | - Storing and accessing applications and data over the internet - Coupled (distributed) - Single system image |
| Information Systems | - A server runs an application and makes it available to remote programmes (clients) - Clients send a request --> Server sends a response - A request that involves several servers is called a distributed transaction |
| Pervasive Systems | - System is often equipped with many sensors that pick up various aspects - Devices are small, battery-powered, mobile and wirelessly connected --> IoT |
| Internet of Things (IoT) | - Connecting all kinds of electronic devices to the internet - Benefits: devices pick up data - Downsides: data safety concerns, over-reliance on technology |
| Reasons for distributed data | - Scalability - Fault tolerance - High availability - Latency |
| Replication | - keeping a copy of the same data on several different nodes in different locations |
| Leader-based replication | |
| Leader-based replication | - one leader and several followers - every write request goes to the leader - read requests can be served by the leader or any follower |
| Synchronous vs. Asynchronous Replication | |
| Handling node outages | - Follower: each follower keeps a log of the data it has received; after recovery it resynchronises with the leader - Leader: a timeout signals that the leader has failed; the follower with the most up-to-date data becomes the new leader |
| Data loss | |
| Split Brain | |
| Timeout | |
| Read your own writes | |
| Monotonic reads | |
| Multi-Leader Replication | |
| Single vs Multi-Leader | - Performance - Tolerance of outages - Tolerance of network problems |
| Performance | - Single: every write must go over the internet to the leader - Multi: every write can be processed by the local datacenter |
| Tolerance of outages | - Single: if the leader fails, failover can promote a follower to be the new leader - Multi: if one leader fails, the other leaders continue operating independently |
| Tolerance of network problems | - Single: very sensitive, because writes are made synchronously - Multi: can tolerate temporary network problems |
| Leaderless Replication | |
| Partitioning | Splitting data into smaller subsets called partitions so that different partitions can be assigned to different nodes |
| Hotspot | Partition with disproportionately high load |
| Partitioning by Hash Key | - Takes skewed data (e.g. timestamps) and makes it uniformly distributed - Disadvantages: loses the key-range partitioning property of efficient range queries; keys that were once adjacent are scattered, so their sort order is lost |
| Rebalancing | |
| Request routing | |
| ZooKeeper | |
| Models of Data Flow | - via databases - via service calls - via asynchronous message passing |
| via Database | - data outlives code - one process writes encoded data, and another process reads it again sometime in the future - migrating data is possible, but expensive on a large dataset |
| via Service Calls | |
| Service Calls - Web services | - when HTTP is used as the underlying protocol for talking to the service, it is called a web service |
| Service Calls - REST (Representational State Transfer) | - not a protocol - design philosophy - builds upon the principles of HTTP - uses URLs for identifying resources |
| Service Calls - SOAP (Simple Object Access Protocol) | - XML-based protocol for making network API requests |
| via Asynchronous Message Passing | - sender doesn't wait for the message to be delivered - simply sends it and then forgets about it |
| SOA | Service-oriented architecture: decomposing large applications into smaller services by functionality |
| Message broker | - stores messages temporarily - acts as a buffer if the recipient is unavailable - redelivers messages if the recipient crashes - allows a message to be sent to different recipients |
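
The message broker card can be illustrated with a small sketch. This is a toy, in-memory model under stated assumptions, not how any real broker is implemented: the class `InMemoryBroker` and its methods `publish`, `deliver`, `ack` and `recipient_crashed` are hypothetical names chosen for this example. It shows the buffering and redelivery behaviour only.

```python
from collections import deque


class InMemoryBroker:
    """Toy message broker: buffers messages and redelivers unacknowledged ones."""

    def __init__(self):
        self._queue = deque()   # messages waiting for a recipient
        self._in_flight = {}    # delivered but not yet acknowledged
        self._next_id = 0

    def publish(self, body):
        # The sender returns immediately; it does not wait for delivery.
        msg_id = self._next_id
        self._next_id += 1
        self._queue.append((msg_id, body))
        return msg_id

    def deliver(self):
        # Hand the oldest buffered message to a recipient, if there is one.
        if not self._queue:
            return None
        msg_id, body = self._queue.popleft()
        self._in_flight[msg_id] = body
        return msg_id, body

    def ack(self, msg_id):
        # The recipient confirms that it has processed the message.
        self._in_flight.pop(msg_id, None)

    def recipient_crashed(self):
        # Put every unacknowledged message back at the front of the queue,
        # preserving the original order, so it can be delivered again.
        for msg_id in sorted(self._in_flight, reverse=True):
            self._queue.appendleft((msg_id, self._in_flight.pop(msg_id)))
```

A sender calls `publish` and moves on; a recipient calls `deliver`, processes the message, then calls `ack`. If the recipient fails before acknowledging, `recipient_crashed` makes the same message available again.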
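
The "Partitioning by Hash Key" card can be sketched in a few lines. This is a minimal illustration under assumptions: the function name `partition_for`, the choice of MD5 as the hash function, and the use of hash modulo partition count are picked for this example and are not taken from the cards.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition number by hashing it (illustrative only)."""
    # A stable hash is used so that every node computes the same partition.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a partition.
    return int.from_bytes(digest[:8], "big") % num_partitions


# Adjacent timestamp keys would all land in one partition under key-range
# partitioning; after hashing they are spread over the partitions instead.
timestamps = [f"2024-01-01T00:00:{second:02d}" for second in range(60)]
assignment = {ts: partition_for(ts, 4) for ts in timestamps}
```

Because neighbouring keys no longer share a partition, a range query such as "all readings from one minute" has to ask every partition, which is the disadvantage the card describes. Taking the hash modulo the number of partitions also means most keys move when that number changes, so this sketch says nothing about rebalancing.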