A distributed database is a database in which portions of the data are stored in multiple physical locations and processing is distributed among multiple database nodes. Distributed databases can be homogeneous or heterogeneous. In a homogeneous distributed database system, all the physical locations have the same underlying hardware and run the same operating systems and database applications. In a heterogeneous distributed database, the hardware, operating systems, or database applications may differ at each location.
Overview
A distributed database is a database distributed between several sites. The data may be distributed because it is inherently distributed in nature or for performance reasons. In a distributed database, the data at each site is not necessarily an independent entity; it may be related to the data stored at the other sites. More formally, a distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system (DDBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the user. A distributed database system (DDBS) is the integration of a DDB and a DDBMS, achieved by merging database and networking technologies [2].
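As a rough illustration of the distribution transparency described above, the following Python sketch models a DDBMS as a thin layer that hides which site stores a record. The class names `Site` and `DistributedDatabase`, the site names, and the hash-based placement rule are all assumptions made for this example; they do not describe any particular DDBMS.

```python
import zlib


class Site:
    """One physical location holding a fragment of the data (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}


class DistributedDatabase:
    """Toy DDBMS layer: callers use put/get and never name a site."""

    def __init__(self, sites):
        self.sites = list(sites)

    def _site_for(self, key):
        # Assumed placement rule: a stable hash of the key picks the site.
        index = zlib.crc32(key.encode("utf-8")) % len(self.sites)
        return self.sites[index]

    def put(self, key, value):
        self._site_for(key).rows[key] = value

    def get(self, key):
        return self._site_for(key).rows.get(key)


sites = [Site("london"), Site("tokyo"), Site("toronto")]
ddb = DistributedDatabase(sites)
for customer_id in ("c-100", "c-101", "c-102", "c-103"):
    ddb.put(customer_id, {"id": customer_id})

# The caller reads by key only; the site lookup stays inside the DDBMS layer.
record = ddb.get("c-101")
```

A real DDBMS would typically consult a catalog of fragments and replicas instead of a single hash rule, but the caller-facing idea is the same: the query does not mention where the data lives.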
A distributed database can reside on network servers on the Internet, on corporate intranets or extranets, or on other company networks. Replicating and distributing databases can improve database performance at end-user worksites. Two processes keep the distributed databases up to date: replication and duplication. Replication uses specialized software that looks for changes in the distributed database. Once the changes have been identified, the replication process propagates them so that all the databases look the same [3].
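The two replication steps described above (identify the changes, then make the copies match) can be sketched as follows. This is a minimal illustration under stated assumptions: the `Primary` and `Replica` classes, the sequence-numbered change log, and the pull-based synchronization are hypothetical choices for the example, not the mechanism of any specific replication product.

```python
class Primary:
    """Source database that records every write in an ordered change log."""

    def __init__(self):
        self.data = {}
        self.log = []  # list of (sequence_number, key, value)

    def write(self, key, value):
        self.data[key] = value
        self.log.append((len(self.log) + 1, key, value))


class Replica:
    """Copy at another site; remembers the last change it applied."""

    def __init__(self):
        self.data = {}
        self.applied = 0

    def pull_changes(self, primary):
        # Step 1: identify changes made since the last synchronization.
        pending = [entry for entry in primary.log if entry[0] > self.applied]
        # Step 2: apply them in order so the copy matches the source.
        for sequence_number, key, value in pending:
            self.data[key] = value
            self.applied = sequence_number
        return len(pending)


primary = Primary()
replica = Replica()
primary.write("stock:widget", 40)
primary.write("stock:gadget", 12)
first_sync = replica.pull_changes(primary)

primary.write("stock:widget", 38)
second_sync = replica.pull_changes(primary)
```

Tracking the last applied sequence number means each synchronization transfers only the new changes, so the replica does not need to copy the whole database every time.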
Database Management System
The DBMS is an integral and indispensable component. Outsourcing it as Database as a Service (DBaaS) is attractive because energy and hardware costs are minimized. This survey deals with determining the workload for a multi-tenancy environment, elastic scalability, and an adjustable security scheme for running over encrypted data. It also studies efficient and scalable ACID transactions in the cloud, achieved by decomposing the functions of a database storage engine into transactional components and data components.
Figure 1: Distributed Database
It is important to understand the difference between distributed and decentralized databases. A decentralized database is also stored on computers at multiple locations, but these computers are not connected by a network and database software, so the data does not appear to be in one logical database. Users at the different sites therefore cannot share data. A decentralized database is best regarded as a collection of independent databases, not as the geographical distribution of a single database. Several business conditions encourage the use of distributed databases:
- Distribution of business units: In modern organizations, divisions, departments, and facilities are frequently geographically dispersed, often across different countries. Each unit may create its own information systems, and these units want local data over which they can have control.
- Data communication costs and reliability: The cost of shipping large quantities of data across a communications network, or of handling a large volume of transactions from remote sources, can still be high, even though data communication costs have fallen substantially in recent years. In many cases it is more economical to locate data and applications close to where they are needed. Moreover, dependence on data communications always involves an element of risk, so keeping local copies or fragments of data can be a reliable way to support the need for rapid access to data across the organization.
- Database recovery: Replicating data on separate computers is one strategy for ensuring that a damaged database can be quickly recovered and that users can still access data while the primary site is being restored. Replicating data across multiple computer sites is one natural form of a distributed database.
- Satisfying both transaction and analytical processing: The requirements for database management vary across OLTP and OLAP applications. Yet the same data are often common to the two databases supporting each kind of application. Distributed database technology can be helpful in synchronizing data across OLTP and OLAP platforms.
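The database recovery point in the list above can be sketched as a read that falls back to a replicated copy while the primary site is unavailable. The names `Site`, `SiteUnavailable`, and `read_with_failover` are hypothetical, and the sketch assumes the replica already holds an up-to-date copy of the data.

```python
class SiteUnavailable(Exception):
    """Raised when a site cannot serve requests (hypothetical error type)."""


class Site:
    def __init__(self, name, rows):
        self.name = name
        self.rows = dict(rows)
        self.online = True

    def read(self, key):
        if not self.online:
            raise SiteUnavailable(self.name)
        return self.rows[key]


def read_with_failover(key, sites):
    """Try each site in order; return (site name, value) from the first that answers."""
    for site in sites:
        try:
            return site.name, site.read(key)
        except SiteUnavailable:
            continue  # this copy is down; try the next one
    raise SiteUnavailable("no site could serve the read")


orders = {"order-7": "shipped"}
primary = Site("primary", orders)
replica = Site("replica", orders)  # assumed to hold an up-to-date copy

primary.online = False  # simulate the primary site being restored
served_by, status = read_with_failover("order-7", [primary, replica])
```

Here the read is served by the replica while the primary is offline, which is the access guarantee the recovery argument relies on. Keeping the replica current is a separate problem handled by the replication process.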
References
[1] Joshi, Himanshu, and G. R. Bamnote, “Distributed database: A survey”, International Journal of Computer Science and Applications 6.2 (2013).
[2] Gupta, Swati, and Kuntal Saroha, “Fundamental research in distributed database”, IJCSMS 11.2 (2011).
[3] Stanchev, Lubomir, “Survey Paper for CS748T Distributed Database Management Lecturer” (2001).