Networks of computers are everywhere. The Internet is one, as are the many networks of which it is composed. Mobile phone networks, corporate networks, factory networks, campus networks, home networks, in-car networks – all of these, both separately and in combination, share the essential characteristics that make them relevant subjects for study under the heading distributed systems. Distributed computing deals with all forms of computing, information access, and information exchange across multiple processing platforms connected by computer networks.
Overview
Over the past two decades, advancements in microelectronic technology have resulted in the availability of fast, inexpensive processors, and advancements in communication technology have resulted in the availability of cost-effective and highly efficient computer networks. The net result of the advancements in these two technologies is that the price-performance ratio has now changed to favor the use of interconnected multiple processors in place of a single, high-speed processor.
Distributed systems form a rapidly changing field of computer science. A distributed computer system consists of multiple software components that reside on multiple computers but run as a single system. The computers in a distributed system can be physically close together and connected by a local network, or they can be geographically distant and connected by a wide area network. A distributed system can combine many kinds of machines, such as mainframes, personal computers, workstations, and minicomputers. The goal of distributed computing is to make such a network work as a single computer. More precisely, a distributed system is one in which components located at networked computers communicate and coordinate their actions only by passing messages. This definition leads to the following especially significant characteristics of distributed systems: concurrency of components, lack of a global clock, and independent failures of components.
Figure 1: Infrastructure for Distributed System
Distributed System Features
A distributed system can be characterized as a collection of mostly autonomous processors communicating over a communication network and having the following features:
No Common Physical Clock: This is an important assumption because it introduces the element of “distribution” in the system and gives rise to the inherent asynchrony amongst the processors.
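Because no processor can read a shared physical clock, events on different machines are often ordered with a logical clock instead. The sketch below shows one widely used technique, a Lamport logical clock. It is an illustration chosen here, not something the text above prescribes, and the class name `LamportClock` and its method names are assumptions.

```python
class LamportClock:
    """Logical clock: a counter that orders events without physical time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event advances the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Jump past the sender's timestamp so the receive is ordered after the send.
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
p.tick()                    # local event on p, p.time is now 1
stamp = p.send()            # send event on p, stamp is 2
q_time = q.receive(stamp)   # q computes max(0, 2) + 1 = 3
print(stamp, q_time)        # prints: 2 3
```

The receive event always gets a larger timestamp than the matching send, which is the ordering guarantee that replaces a shared wall clock.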
No Shared Memory: This is a key feature that requires message-passing for communication. This feature implies the absence of the common physical clock.
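A minimal sketch of message passing follows. It assumes two threads joined by a connected socket pair as a stand-in for two computers on a network; the `worker` function and the JSON message layout are hypothetical. The point is that the worker never reads the sender's variables and sees only the bytes it receives.

```python
import json
import socket
import threading


def worker(conn):
    # The worker sees only the bytes that arrive on its end of the link.
    request = json.loads(conn.recv(1024).decode("utf-8"))
    reply = {"sum": sum(request["numbers"])}
    conn.sendall(json.dumps(reply).encode("utf-8"))
    conn.close()


# socketpair() returns two connected sockets, standing in for a network link.
left, right = socket.socketpair()
t = threading.Thread(target=worker, args=(right,))
t.start()

# The request is serialized, sent, and answered with another message.
left.sendall(json.dumps({"numbers": [1, 2, 3]}).encode("utf-8"))
result = json.loads(left.recv(1024).decode("utf-8"))
t.join()
left.close()
print(result)  # prints: {'sum': 6}
```

A real system would also frame messages (for example with a length prefix), since a single `recv` call is only adequate for the tiny payloads used here.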
Geographical Separation: The farther apart the processors are geographically, the more representative the system is of a distributed system. However, the processors need not be on a wide-area network (WAN). A network or cluster of workstations (NOW/COW) connecting processors on a LAN is also increasingly regarded as a small distributed system. The NOW configuration is becoming popular because low-cost, high-speed off-the-shelf processors are now available. The Google search engine is based on the NOW architecture.
Autonomy and Heterogeneity: The processors are “loosely coupled” in that they have different speeds and each can be running a different operating system. They are usually not part of a dedicated system but cooperate with one another by offering services or solving a problem jointly.
Challenges for a Distributed System
Designing a distributed system is neither easy nor straightforward. A number of challenges must be overcome to approach the ideal system. The major challenges in distributed systems are listed below:
Figure 2: Overview of Challenges
Heterogeneity: The Internet enables users to access services and run applications over a heterogeneous collection of computers and networks. Heterogeneity (that is, variety and difference) applies to all of the following:
- Hardware Devices: computers, tablets, mobile phones, embedded devices, etc.
- Operating System: MS Windows, Linux, Mac, Unix, etc.
- Networks: Local network, the Internet, wireless network, satellite links, etc.
- Programming Languages: Java, C/C++, Python, PHP, etc.
- Different roles of software developers, designers, system managers
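One concrete effect of hardware heterogeneity is that machines may lay out the same integer in memory in different byte orders. The sketch below uses Python's standard `struct` module to show the mismatch and the usual remedy of agreeing on one wire format (network byte order). The value 1025 is an arbitrary choice for illustration.

```python
import struct

value = 1025  # 0x00000401

little = struct.pack("<I", value)   # little-endian layout
big = struct.pack(">I", value)      # big-endian layout
network = struct.pack("!I", value)  # network byte order, which is big-endian

# Reading little-endian bytes as if they were big-endian gives a wrong number.
misread = struct.unpack(">I", little)[0]

# Agreeing on network byte order lets any receiver decode the value correctly.
decoded = struct.unpack("!I", network)[0]

print(little.hex(), big.hex())  # prints: 01040000 00000401
print(misread, decoded)         # prints: 17039360 1025
```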
Transparency: Transparency is defined as the concealment from the user and the application programmer of the separation of components in a distributed system so that the system is perceived as a whole rather than as a collection of independent components. In other words, designers of distributed systems must hide the complexity of the system as much as they can. The main forms of transparency in distributed systems are:
- Access: Hide differences in data representation and how a resource is accessed
- Location: Hide where a resource is located
- Migration: Hide that a resource may move to another location
- Relocation: Hide that a resource may be moved to another location while in use
- Replication: Hide that a resource may be copied in several places
- Concurrency: Hide that a resource may be shared by several competing users
- Failure: Hide the failure and recovery of a resource
- Persistence: Hide whether a (software) resource is in memory or on disk
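Location and migration transparency can be sketched with a simple name directory: clients refer to a resource by a stable logical name, and the directory maps that name to wherever the resource currently lives. The `Directory` class, the `fetch` helper, and the host names below are hypothetical and only illustrate the idea.

```python
class Directory:
    """Maps a stable logical name to the resource's current host."""

    def __init__(self):
        self._locations = {}

    def register(self, name, host):
        self._locations[name] = host

    def migrate(self, name, new_host):
        # The resource moves; its logical name does not change.
        self._locations[name] = new_host

    def resolve(self, name):
        return self._locations[name]


def fetch(directory, name):
    # Client code uses only the logical name, never a host address.
    host = directory.resolve(name)
    return f"read {name} from {host}"


directory = Directory()
directory.register("reports/q1", "node-a.example")
before = fetch(directory, "reports/q1")

directory.migrate("reports/q1", "node-b.example")
after = fetch(directory, "reports/q1")  # same client call, new location

print(before)  # prints: read reports/q1 from node-a.example
print(after)   # prints: read reports/q1 from node-b.example
```

The client call is identical before and after the move, which is what the location and migration entries in the list above require.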
Openness: The openness of a computer system is the characteristic that determines whether the system can be extended and reimplemented in various ways. The openness of distributed systems is determined primarily by the degree to which new resource-sharing services can be added and made available for use by a variety of client programs. If a system's interfaces are well defined and published, it is easier for developers to add new features or replace subsystems in the future. Example: Twitter and Facebook publish APIs that allow developers to build their own software on top of these platforms.
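The value of a published interface can be sketched as follows. The client depends only on the interface, so an implementation can be replaced or extended without touching client code. `StorageService`, `InMemoryStorage`, and `AuditedStorage` are hypothetical names invented for this example and do not correspond to any real API.

```python
from abc import ABC, abstractmethod


class StorageService(ABC):
    """Hypothetical published interface: the only thing clients depend on."""

    @abstractmethod
    def put(self, key, value):
        ...

    @abstractmethod
    def get(self, key):
        ...


class InMemoryStorage(StorageService):
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


class AuditedStorage(StorageService):
    """A later extension: adds an audit trail without changing the interface."""

    def __init__(self, inner):
        self._inner = inner
        self.log = []

    def put(self, key, value):
        self.log.append(("put", key))
        self._inner.put(key, value)

    def get(self, key):
        self.log.append(("get", key))
        return self._inner.get(key)


def client(storage):
    # Written once against the interface; works with any implementation.
    storage.put("greeting", "hello")
    return storage.get("greeting")


plain = client(InMemoryStorage())
audited_store = AuditedStorage(InMemoryStorage())
audited = client(audited_store)
print(plain, audited, audited_store.log)
```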
Concurrency: Both services and applications provide resources that can be shared by clients in a distributed system. There is therefore a possibility that several clients will attempt to access a shared resource at the same time. For example, a data structure that records bids for an auction may be accessed very frequently when it gets close to the deadline time. For an object to be safe in a concurrent environment, its operations must be synchronized in such a way that its data remains consistent. This can be achieved by standard techniques such as semaphores, which are used in most operating systems.
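The auction example can be sketched with a binary semaphore guarding the shared bid record. This is a single-process illustration using threads as stand-ins for clients; the `Auction` class and the bid values are assumptions made for the example.

```python
import threading


class Auction:
    def __init__(self):
        self.highest = 0
        self.bid_count = 0
        # Binary semaphore: at most one client may update the record at a time.
        self._guard = threading.Semaphore(1)

    def bid(self, amount):
        with self._guard:
            # The read-compare-write sequence runs without interleaving.
            self.bid_count += 1
            if amount > self.highest:
                self.highest = amount


auction = Auction()


def client(start):
    # Each client submits 1000 bids with increasing amounts.
    for amount in range(start, start + 1000):
        auction.bid(amount)


threads = [threading.Thread(target=client, args=(i * 1000,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(auction.bid_count, auction.highest)  # prints: 8000 7999
```

Without the semaphore, two clients could both read the same `highest` value and one update could overwrite the other, leaving the record inconsistent.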
Security: Many of the information resources that are made available and maintained in distributed systems have a high intrinsic value to their users. Their security is therefore of considerable importance. Security for information resources has three components:
- Confidentiality (protection against disclosure to unauthorized individuals)
- Integrity (protection against alteration or corruption)
- Availability for the authorized (protection against interference with the means to access the resources)
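Of the three components, integrity is the easiest to illustrate briefly. The sketch below attaches a keyed hash (HMAC-SHA256, from Python's standard `hmac` and `hashlib` modules) to a message so the receiver can detect alteration. The key and message contents are made up, and the example assumes both parties already share the key; it does not address confidentiality or availability.

```python
import hashlib
import hmac

# Assumption: sender and receiver obtained this key through a secure channel.
SECRET_KEY = b"demo-key-not-for-production"


def sign(message: bytes) -> bytes:
    # Compute a keyed digest that only holders of the key can reproduce.
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sign(message), tag)


message = b"transfer 100 to account 42"
tag = sign(message)

ok = verify(message, tag)
tampered_ok = verify(b"transfer 900 to account 42", tag)
print(ok, tampered_ok)  # prints: True False
```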
Scalability: Distributed systems must remain effective as the number of users increases. A system is said to be scalable if it can handle the addition of users and resources without suffering a noticeable loss of performance or increase in administrative complexity.
Scalability has three dimensions:
- Size: The number of users and resources to be processed. The associated problem is overloading.
- Geography: The distance between users and resources. The associated problem is communication reliability.
- Administration: As a distributed system grows, more of its parts must be controlled, often by different organizations. The associated problem is administrative complexity.
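For the size dimension, one common way to avoid overloading a single machine is to partition users across several servers. The sketch below uses simple hash partitioning. It is one possible approach rather than a method taken from the text, and the server names and the `pick_server` function are hypothetical.

```python
import hashlib
from collections import Counter


def pick_server(user_id: str, servers: list) -> str:
    # A stable hash gives every client the same answer for the same user.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["s0", "s1", "s2", "s3"]

# Count how many of 10,000 simulated users land on each server.
load = Counter(pick_server(f"user-{i}", servers) for i in range(10_000))
print(sorted(load.items()))
```

Each server ends up with roughly a quarter of the users. A known limitation of this modulo scheme is that changing the number of servers remaps most users; consistent hashing is a common refinement that reduces this.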
Failure Handling: Computer systems sometimes fail. When faults occur in hardware or software, programs may produce incorrect results or may stop before they have completed the intended computation. Failures in a distributed system are partial: some components fail while others continue to function. This makes the handling of failures particularly difficult.
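One basic way to tolerate transient faults is to retry a failed operation a bounded number of times, waiting longer between attempts. The sketch below is a minimal illustration; `TransientError`, `call_with_retry`, and the simulated `flaky` operation are hypothetical names, and the delays are kept tiny so the example runs quickly.

```python
import time


class TransientError(Exception):
    """Stands in for a recoverable fault such as a dropped message."""


def call_with_retry(operation, attempts=3, base_delay=0.01):
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # give up and let the caller handle the failure
            # Exponential backoff: wait 1x, 2x, 4x ... the base delay.
            time.sleep(base_delay * 2 ** (attempt - 1))


calls = {"n": 0}


def flaky():
    # Simulated remote call that fails twice and then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("simulated fault")
    return "ok"


result = call_with_retry(flaky)
print(result, calls["n"])  # prints: ok 3
```

Retrying is only safe when the operation can be repeated without harm, for example a read or an idempotent update.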
References
[1] Ajay D. Kshemkalyani and Mukesh Singhal, “Distributed Computing: Principles, Algorithms, and Systems”, Cambridge University Press, 2011.
[2] George Coulouris and Jean Dollimore, “Distributed Systems: Concepts and Design”, Pearson Education, 2005.
[3] S.G. Bhagwath and Dr. Mallikarjun Math, “Distributed Systems and Recent Innovations: Challenges Benefits and Security Issues in Distributed Systems”, Bonfring International Journal of Software Engineering and Soft Computing, Vol. 6, Special Issue, October 2016.
[4] Krishna Nadiminti and Rajkumar Buyya, “Distributed Systems and Recent Innovations: Challenges and Benefits”, InfoNet Magazine, Vol. 16, No. 3, 2006, pp. 1-5.