What is a Distributed System?

Networks of computers are everywhere. The Internet is one, as are the many networks of which it is composed. Mobile phone networks, corporate networks, factory networks, campus networks, home networks, in-car networks – all of these, both separately and in combination, share the essential characteristics that make them relevant subjects for study under the heading distributed systems. Distributed computing deals with all forms of computing, information access, and information exchange across multiple processing platforms connected by computer networks.


Over the past two decades, advancements in microelectronic technology have resulted in the availability of fast, inexpensive processors, and advancements in communication technology have resulted in the availability of cost-effective and highly efficient computer networks. The net result of the advancements in these two technologies is that the price-performance ratio has now changed to favor the use of interconnected multiple processors in place of a single, high-speed processor.

Distributed systems form a rapidly changing field of computer science. A distributed computer system consists of multiple software components that run on multiple computers but operate as a single system. The computers in a distributed system can be physically close together and connected by a local network, or geographically distant and connected by a wide area network. A distributed system can include any number of configurations, such as mainframes, personal computers, workstations, minicomputers, and so on. The goal of distributed computing is to make such a network work as a single computer. A distributed system is one in which components located at networked computers communicate and coordinate their actions only by passing messages. This definition leads to the following especially significant characteristics of distributed systems: concurrency of components, lack of a global clock, and independent failures of components.

Figure 1: Infrastructure for Distributed System

Distributed System Features

A distributed system can be characterized as a collection of mostly autonomous processors communicating over a communication network and having the following features:

No Common Physical Clock: This is an important assumption because it introduces the element of “distribution” in the system and gives rise to the inherent asynchrony amongst the processors.

No Shared Memory: This is a key feature that requires message-passing for communication. This feature implies the absence of the common physical clock.

Geographical Separation: The more geographically separated the processors are, the more representative the system is of a typical distributed system. However, it is not necessary for the processors to be on a wide-area network (WAN). Recently, the network/cluster of workstations (NOW/COW) configuration, connecting processors on a LAN, is also increasingly being regarded as a small distributed system. The NOW configuration is becoming popular because of the low-cost, high-speed off-the-shelf processors now available. The Google search engine is based on the NOW architecture.

Autonomy and Heterogeneity: The processors are “loosely coupled” in that they have different speeds and each can be running a different operating system. They are usually not part of a dedicated system but cooperate with one another by offering services or solving a problem jointly.
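
Since the processors share neither memory nor a common physical clock, coordination rests entirely on message passing. Lamport's logical clocks are the classic way to order events under this asynchrony; the sketch below is illustrative (the process names and event sequence are made up):

```python
# Minimal sketch of Lamport logical clocks: each process keeps a counter,
# increments it on every local event, and stamps outgoing messages.
# On receipt, the clock jumps past the sender's timestamp, so causally
# related events stay ordered even without a common physical clock.

class LamportProcess:
    def __init__(self, name):
        self.name = name
        self.clock = 0

    def local_event(self):
        self.clock += 1
        return self.clock

    def send(self):
        self.clock += 1
        return self.clock          # timestamp carried by the message

    def receive(self, msg_timestamp):
        self.clock = max(self.clock, msg_timestamp) + 1
        return self.clock

p, q = LamportProcess("P"), LamportProcess("Q")
p.local_event()            # P: 1
ts = p.send()              # P: 2, message stamped 2
q.local_event()            # Q: 1
q.receive(ts)              # Q: max(1, 2) + 1 = 3
print(p.clock, q.clock)    # 2 3
```

Note how Q's clock jumps from 1 to 3 on receipt: the send is guaranteed a smaller timestamp than the receive, which is all a distributed system can promise without a global clock.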

Challenges for a Distributed System

Designing a distributed system is neither easy nor straightforward. A number of challenges need to be overcome in order to get an ideal system. The major challenges in distributed systems are listed below:

Figure 2: Overview of Challenges

Heterogeneity:  The Internet enables users to access services and run applications over a heterogeneous collection of computers and networks. Heterogeneity (that is, variety and difference) applies to all of the following:

  • Hardware Devices: computers, tablets, mobile phones, embedded devices, etc.
  • Operating Systems: MS Windows, Linux, Mac, Unix, etc.
  • Networks: Local network, the Internet, wireless network, satellite links, etc.
  • Programming Languages: Java, C/C++, Python, PHP, etc.
  • Different roles of software developers, designers, system managers

Transparency: Transparency is defined as the concealment from the user and the application programmer of the separation of components in a distributed system, so that the system is perceived as a whole rather than as a collection of independent components. In other words, distributed systems designers must hide the complexity of the system as much as they can. The main forms of transparency in distributed systems are:

  • Access: Hide differences in data representation and how a resource is accessed
  • Location: Hide where a resource is located
  • Migration: Hide that a resource may move to another location
  • Relocation: Hide that a resource may be moved to another location while in use
  • Replication: Hide that a resource may be copied in several places
  • Concurrency: Hide that a resource may be shared by several competing users
  • Failure: Hide the failure and recovery of a resource
  • Persistence: Hide whether a (software) resource is in memory or on disk

Openness: The openness of a computer system is the characteristic that determines whether the system can be extended and reimplemented in various ways. The openness of distributed systems is determined primarily by the degree to which new resource-sharing services can be added and made available for use by a variety of client programs. If the well-defined interfaces of a system are published, it is easier for developers to add new features or replace subsystems in the future. Example: Twitter and Facebook provide APIs that allow developers to build their own software on top of these platforms.

Concurrency: Both services and applications provide resources that can be shared by clients in a distributed system. There is therefore a possibility that several clients will attempt to access a shared resource at the same time. For example, a data structure that records bids for an auction may be accessed very frequently when it gets close to the deadline time. For an object to be safe in a concurrent environment, its operations must be synchronized in such a way that its data remains consistent. This can be achieved by standard techniques such as semaphores, which are used in most operating systems.
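
To illustrate such synchronization for the auction example, a mutual-exclusion lock (a binary semaphore) can guard the shared bid record; the class and field names here are hypothetical:

```python
import threading

# Hypothetical shared bid record for the auction example: a lock ensures
# that concurrent bidders never interleave the read-compare-write
# sequence, so the highest bid stays consistent.
class AuctionItem:
    def __init__(self):
        self.highest_bid = 0
        self._lock = threading.Lock()

    def place_bid(self, amount):
        with self._lock:            # critical section: one bidder at a time
            if amount > self.highest_bid:
                self.highest_bid = amount
                return True
            return False

item = AuctionItem()
threads = [threading.Thread(target=item.place_bid, args=(bid,))
           for bid in (10, 25, 17, 40, 33)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(item.highest_bid)   # 40, regardless of the interleaving
```

Without the lock, two bidders could both read the same `highest_bid` and overwrite each other's updates; the lock serializes the compare-and-update step.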

Security: Many of the information resources that are made available and maintained in distributed systems have a high intrinsic value to their users. Their security is therefore of considerable importance. Security for information resources has three components:

  • Confidentiality (protection against disclosure to unauthorized individuals)
  • Integrity (protection against alteration or corruption),
  • Availability for the authorized (protection against interference with the means to access the resources).

Scalability: Distributed systems must be scalable as the number of users increases. A system is said to be scalable if it can handle the addition of users and resources without suffering a noticeable loss of performance or increase in administrative complexity.

Scalability has 3 dimensions:

Size: Number of users and resources to be processed. The associated problem is overloading.

Geography: Distance between users and resources. The associated problem is communication reliability.

Administration: As the size of a distributed system increases, many independent systems need to be controlled. The associated problem is an administrative mess.

Failure Handling: Computer systems sometimes fail. When faults occur in hardware or software, programs may produce incorrect results or may stop before they have completed the intended computation. The handling of failures is particularly difficult.




What is a Distributed Database?

A distributed database is a database in which portions of the database are stored in multiple physical locations and processing is distributed among multiple database nodes. Distributed databases can be homogenous or heterogeneous. In a homogenous distributed database system, all the physical locations have the same underlying hardware and run the same operating systems and database applications. In a heterogeneous distributed database, the hardware, operating systems, or database applications may be different at each of the locations.


A distributed database is a database distributed between several sites. The reasons for the data distribution may include the inherently distributed nature of the data or performance reasons. In a distributed database the data at each site is not necessarily an independent entity but can be rather related to the data stored on the other sites.  A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system (DDBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the user. A distributed database system (DDBS) is the integration of DDB and DDBMS. This integration is achieved through the merging of the database and networking technologies together [2].

A distributed database can reside on network servers on the Internet, on corporate intranets or extranets, or on other company networks. The replication and distribution of databases improves database performance at end-user worksites. To ensure that distributed databases remain up to date, there are two processes: replication and duplication. Replication involves using specialized software that looks for changes in the distributed database. Once the changes have been identified, the replication process makes all the databases look the same [3].

Database Management System

Outsourcing the DBMS, an integral and indispensable component, is attractive because costs for energy, hardware, and administration are minimized through Database as a Service (DBaaS). This survey deals with determining the workload for a multi-tenancy environment, elastic scalability, and an adjustable security scheme for running over encrypted data. It also studies efficient and scalable ACID transactions in the cloud, achieved by decomposing the functions of a database storage engine into transactional components and data components.

Figure 1: Distributed Database

It is important to understand the difference between distributed and decentralized databases. A decentralized database is also stored on computers at multiple locations; however, these computers are not interconnected by network and database software, so the data does not appear to be in one logical database. As a result, users at the different sites cannot access each other's data. A decentralized database is best regarded as a collection of independent databases, rather than the geographical distribution of a single database. The use of distributed databases is increasing in several business situations:

  • Distribution and autonomy of business units: Divisions, departments, and facilities in modern organizations are frequently geographically dispersed, often across different countries. Each unit may create its own information systems, and these units want local data over which they can have control.
  • Data communication costs and reliability: The cost to ship large quantities of data across a communication network, or to handle a large volume of transactions from remote sources, can still be high, even though data communication costs have fallen substantially in recent years. In many cases it is more economical to locate data and applications close to where they are needed. Moreover, dependence on data communication always involves an element of risk, so keeping local copies or fragments of data can be a reliable way to support the need for rapid access to data across the organization.
  • Database recovery: Replicating data on separate computers is one strategy for ensuring that a damaged database can be quickly recovered and that users can have access to data while the primary site is being restored. Replicating data across multiple computer sites is one natural form of a distributed database.
  • Satisfying both transaction and analytical processing: The requirements for database management vary across OLTP and OLAP applications. Yet the same data are often shared between the two databases supporting each kind of application. Distributed database technology can be helpful in synchronizing data across OLTP and OLAP platforms.


[1] Joshi, Himanshu, and G. R. Bamnote, “Distributed database: A survey”, International Journal of Computer Science and Applications 6.2 (2013).

[2] Gupta, Swati, and Kuntal Saroha, “Fundamental research in distributed database”, IJCSMS 11.2 (2011).

[3] Stanchev, Lubomir, “Survey Paper for CS748T Distributed Database Management Lecturer” (2001).

What is Cyber Security and Why is it Required?

The term cyber security is often used interchangeably with the term information security. Cyber security is the activity of protecting information and information systems (networks, computers, databases, data centers, and applications) with appropriate procedural and technological security measures. Cybersecurity has become a matter of global interest and importance. It refers to a set of techniques used to protect the integrity of networks, programs, and data from attack, damage, or unauthorized access.


Cyber security is the collection of tools, policies, security concepts, security safeguards, guidelines, risk management approaches, actions, training, best practices, assurance, and technologies that can be used to protect the cyber environment and organization and user’s assets. Organization and user assets include connected computing devices, personnel, infrastructure, applications, services, telecommunications systems, and the totality of transmitted and/or stored information in the cyber environment. It strives to ensure the attainment and maintenance of the security properties of the organization and user’s assets against relevant security risks in the cyber environment.

Cyber security refers to the body of technologies, processes, and practices designed to protect networks, devices, programs, and data from attack, damage, or unauthorized access. Cyber security may also be referred to as information technology security.

Why is it Required?

The core functionality involves protecting information and systems from major cyber threats. These cyber threats take many forms (e.g., application attacks, malware, ransomware, phishing, and exploit kits). Unfortunately, cyber adversaries have learned to launch automated and sophisticated attacks using these tactics – at lower and lower costs. As a result, keeping pace with security strategy and operations can be a challenge, particularly in government and enterprise networks where, in their most disruptive form, cyber threats often take aim at secret, political, military, or infrastructural assets of a nation, or its people. Some of the common threats are outlined below in detail.

  • Cyberterrorism is the disruptive use of information technology by terrorist groups to further their ideological or political agenda. This takes the form of attacks on networks, computer systems, and telecommunication infrastructures.
  • Cyber warfare involves nation-states using information technology to penetrate another nation’s networks to cause damage or disruption. In the U.S. and many other nations, cyber warfare has been acknowledged as the fifth domain of warfare (following land, sea, air, and space). Cyber warfare attacks are primarily executed by hackers who are well-trained in exploiting the intricacies of computer networks and operate under the auspices and support of nation-states. Rather than “shutting down” a target’s key networks, a cyber warfare attack may intrude into networks to compromise valuable data, degrade communications, impair such infrastructural services as transportation and medical services, or interrupt commerce.
  • Cyber espionage is the practice of using information technology to obtain secret information without permission from its owners or holders. Cyber espionage is most often used to gain strategic, economic, political, or military advantage, and is conducted using cracking techniques and malware.

Types of cyber security threats

Ransomware:  Ransomware is a type of malicious software. It is designed to extort money by blocking access to files or the computer system until the ransom is paid. Paying the ransom does not guarantee that the files will be recovered or the system restored.

Malware: Malware is a type of software designed to gain unauthorized access or to cause damage to a computer.

Social engineering: Social engineering is a tactic that adversaries use to trick you into revealing sensitive information. They can solicit a monetary payment or gain access to your confidential data. Social engineering can be combined with any of the threats listed above to make you more likely to click on links, download malware, or trust a malicious source.

Phishing: Phishing is the practice of sending fraudulent emails that resemble emails from reputable sources. The aim is to steal sensitive data like credit card numbers and login information. It’s the most common type of cyber attack. You can help protect yourself through education or a technology solution that filters malicious emails.

Figure 1: Layered View of Cyber Security Framework

Cyber Security Trends

  • IT regulations improvement
  • Data theft turning into data manipulation
  • Demand will continue to rise for security skills
  • Security in the Internet of Things (IoT)
  • Attackers will target consumer devices
  • Attackers will become bolder, more commercial, and less traceable
  • Cyber risk insurance will become more common
  • New job titles appearing – CCO (chief cybercrime officer)



Understanding Data Deduplication in the Cloud

Rendering efficient storage and security for all data is very important for the cloud. With the rapidly increasing amounts of data produced worldwide, networked and multi-user storage systems are becoming very popular. However, concerns over data security still prevent many users from migrating data to remote storage.

Data deduplication refers to a technique for eliminating redundant data in a data set. In the process of deduplication, extra copies of the same data are deleted, leaving only one copy to be stored. The data is analyzed to identify duplicate byte patterns and to ensure that the single remaining instance is indeed unique. Duplicates are then replaced with a reference that points to the stored chunk.

Data deduplication is a technique to reduce storage space. It identifies redundant data by using hash values to compare data chunks, stores only one copy, and creates logical pointers to the other copies instead of storing the redundant data again. Deduplication reduces data volume, so disk space and network bandwidth requirements shrink, which lowers the costs and energy consumption of running storage systems.

Figure 1: Data De-duplication View

It is a technique whose objective is to improve storage efficiency. To reduce storage space, traditional deduplication systems identify duplicated data chunks and store only one replica of each chunk. Logical pointers are created for the other copies instead of storing the redundant data. Deduplication can reduce both storage space and network bandwidth. However, such techniques can negatively affect system fault tolerance: because many files may refer to the same data chunk, a chunk that becomes unavailable due to failure reduces the reliability of all of them. Due to this problem, many approaches and techniques have been proposed that not only achieve storage efficiency but also improve fault tolerance.


Data deduplication provides practical ways to achieve these goals, including

  • Capacity optimization. It stores more data in less physical space. It achieves greater storage efficiency than was possible by using features such as Single Instance Storage (SIS) or NTFS compression. It uses subfile variable-size chunking and compression, which deliver optimization ratios of 2:1 for general file servers and up to 20:1 for virtualization data.
  • Scale and performance. It is highly scalable, resource-efficient, and nonintrusive. It can process up to 50 MB per second in Windows Server 2012 R2, and about 20 MB of data per second in Windows Server 2012. It can run on multiple volumes simultaneously without affecting other workloads on the server.
  • Reliability and data integrity. When it is applied, the integrity of the data is maintained. Data Deduplication uses checksum, consistency, and identity validation to ensure data integrity. For all metadata and the most frequently referenced data, data deduplication maintains redundancy to ensure that the data is recoverable in the event of data corruption.
  • Bandwidth efficiency with BranchCache. Through integration with BranchCache, the same optimization techniques are applied to data transferred over the WAN to a branch office. The result is faster file download times and reduced bandwidth consumption.
  • Optimization management with familiar tools. It has optimization functionality built into Server Manager and Windows PowerShell. Default settings can provide savings immediately, or administrators can fine-tune the settings to see more gains.

Data de-duplication Methods

Data deduplication identifies duplicate data, removing redundancies and reducing the overall volume of data transferred and stored. There are two methods, block-level and byte-level deduplication, and both deliver the benefit of optimized storage capacity. When, where, and how the processes work should be reviewed against your data backup environment and its specific requirements before selecting one approach over the other.

  1. Block-level Approaches

Block-level data deduplication segments data streams into blocks, inspecting the blocks to determine if each has been encountered before (typically by generating a digital signature or unique identifier via a hash algorithm for each block). If the block is unique, it is written to disk, and its unique identifier is stored in an index; otherwise, only a pointer to the original, unique block is stored. By replacing repeated blocks with much smaller pointers rather than storing the block again, disk storage space is saved.
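
The block-level scheme described above can be sketched as follows, using fixed-size blocks and SHA-256 digests as the unique identifiers (the block size and names are illustrative; real systems use much larger blocks and often variable-size chunking):

```python
import hashlib

BLOCK_SIZE = 8  # illustrative only; real systems use e.g. 4 KB-128 KB

def deduplicate(data: bytes):
    """Split data into fixed-size blocks; store each unique block once
    and represent the stream as a list of pointers (digests)."""
    store = {}      # digest -> block bytes (the single stored instance)
    pointers = []   # per-block references replacing repeated blocks
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = block      # unique block: write it to disk
        pointers.append(digest)        # duplicates cost only a pointer
    return store, pointers

def reassemble(store, pointers):
    return b"".join(store[d] for d in pointers)

data = b"ABCDEFGH" * 3 + b"12345678"   # three identical blocks + one
store, ptrs = deduplicate(data)
print(len(ptrs), len(store))           # 4 blocks seen, 2 stored
assert reassemble(store, ptrs) == data
```

Here four blocks of input are represented by only two stored blocks plus four small pointers, which is exactly the space saving the index-plus-pointer design is after.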

  2. Byte-level Data De-duplication

Analyzing data streams at the byte level is another approach to deduplication. By performing a byte-by-byte comparison of new data streams against previously stored ones, a higher level of accuracy can be delivered. Deduplication products that use this method have one thing in common: since the incoming backup data stream has likely been seen before, it is reviewed to see whether it matches similar data received in the past.



What is Data Compression?

Compression is used just about everywhere. Data compression involves the development of a compact representation of information. Most representations of information contain large amounts of redundancy. Redundancy can exist in various forms. Internet users who download or upload files from/to the web, or use email to send or receive attachments will most likely have encountered files in compressed format.

General Overview

With the extended use of computers in various disciplines, the number of data processing applications is also increasing, and these require the processing and storing of large volumes of data. Data compression is primarily a branch of information theory, which deals with techniques for minimizing the amount of data to be transmitted and stored. It is often called coding, where coding is a general term encompassing any special representation of data that satisfies a given need. Information theory is the study of efficient coding and its consequences, in the form of speed of transmission and probability of error.

What is it?

Today, with the growing demands of information storage and data transfer, data compression is becoming increasingly important. Compression is the process of encoding data more efficiently to reduce file size. One type of compression available is referred to as lossless compression. This means the compressed file will be restored exactly to its original state, with no loss of data during the decompression process. This is essential for data whose files would be corrupted and unusable should any data be lost. Data compression is the art of reducing the number of bits needed to store or transmit data, and it is one of the enabling technologies for multimedia applications. It would not be practical to put images, audio, and video on websites without compression algorithms, and mobile phones could not provide clear communication without data compression. With compression techniques, we can reduce the consumption of resources such as hard disk space or transmission bandwidth.

Data Compression Principles

Below, data compression principles are listed:

  • It is the substitution of frequently occurring data items, or symbols, with short codes that require fewer bits of storage than the original symbols.
  • It saves space, but requires time to compress and extract the data.
  • Success varies with the type of data.
  • It works best on data with low spatial variability and a limited set of possible values.
  • It works poorly on data with high spatial variability or continuous surfaces.
  • It exploits inherent redundancy and irrelevancy by transforming a data file into a smaller one.

Figure 1: Data Compression Process

Data Compression Technique

Data compression is the function of the presentation layer in the OSI reference model. Compression is often used to maximize the use of bandwidth across a network or to optimize disk space when saving data.

There are two general types of compression techniques:

Figure 2: Classification of Compression

Lossless Compression

Lossless compression compresses the data in such a way that when data is decompressed it is exactly the same as it was before compression i.e. there is no loss of data. Lossless compression is used to compress file data such as executable code, text files, and numeric data because programs that process such file data cannot tolerate mistakes in the data. Lossless compression will typically not compress files as much as lossy compression techniques and may take more processing power to accomplish the compression.

Lossless data compression is compression without any loss of data quality. The decompressed file is an exact replica of the original one. Lossless compression is used when it is important that the original and the decompressed data be identical. It is done by re-writing the data in a more space-efficient way, removing all kinds of repetitions (compression ratio 2:1). Some image file formats, notably PNG, use only lossless compression, while those like TIFF may use either lossless or lossy methods.
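
As a quick illustration of the lossless round trip, Python's standard `zlib` module (an implementation of the DEFLATE algorithm) restores the input exactly:

```python
import zlib

# Lossless round trip: the decompressed bytes are an exact replica of
# the original, as the definition of lossless compression requires.
original = b"AAAABBBBCCCC" * 100          # highly repetitive data
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))     # repetition compresses well
assert restored == original               # no loss of data
```

The repetitive input shrinks dramatically, while `restored == original` holds bit for bit; a lossy codec would offer better ratios but could not make that guarantee.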

Lossless Compression Algorithms

The various algorithms used to implement lossless data compression are:

Run Length Encoding

  • This method replaces the consecutive occurrences of a given symbol with only one copy of the symbol along with a count of how many times that symbol occurs. Hence the name ‘run length’.
  • For example, the string AAABBCDDDD would be encoded as 3A2B1C4D.
  • A real-life example where run-length encoding is quite effective is the fax machine. Most faxes are white sheets with occasional black text. So, a run-length encoding scheme can take each line and transmit the code for white followed by the number of pixels, then the code for black and the number of pixels, and so on.
  • This method of compression must be used carefully. If there is not a lot of repetition in the data then it is possible the run length encoding scheme would actually increase the size of a file.
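
The steps above can be sketched in a few lines. This toy encoder always emits a count before each symbol, so it also demonstrates the expansion pitfall mentioned in the last point:

```python
from itertools import groupby

def rle_encode(s: str) -> str:
    # Replace each run of a repeated symbol with "<count><symbol>".
    return "".join(f"{len(list(run))}{ch}" for ch, run in groupby(s))

print(rle_encode("AAABBCDDDD"))   # 3A2B1C4D  (10 chars -> 8)
print(rle_encode("ABCD"))         # 1A1B1C1D  (4 chars -> 8: it grew!)
```

The second call shows why the method must be used carefully: with no repetition, the count prefixes double the size of the data.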

Differential Pulse Code Modulation

  • In this method first, a reference symbol is placed. Then for each symbol in the data, we place the difference between that symbol and the reference symbol used.
  • For example, using symbol A as the reference symbol, the string AAABBCDDDD would be encoded as A0001123333, since A is the same as the reference symbol, B has a difference of 1 from the reference symbol, and so on.
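
A toy version of this differential scheme, using alphabetical distance from the reference symbol as the difference:

```python
def dpcm_encode(s: str) -> str:
    # Place the reference symbol first, then emit the difference between
    # each symbol and the reference (here: distance in the alphabet).
    ref = s[0]
    return ref + "".join(str(ord(ch) - ord(ref)) for ch in s)

print(dpcm_encode("AAABBCDDDD"))   # A0001123333
```

The gain comes when the differences are small and repetitive: a stream of single-digit differences is cheaper to store or to feed into a further compression stage than the original symbols.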

Dictionary Based Encoding

  • One of the best-known dictionary-based encoding algorithms is Lempel-Ziv (LZ) compression algorithm.
  • This method is also known as substitution coder.
  • In this method, a dictionary (table) of variable-length strings (common phrases) is built.
  • This dictionary contains almost every string that is expected to occur in data.
  • When any of these strings occur in the data, then they are replaced with the corresponding index to the dictionary.
  • In this method, instead of working with individual characters in text data, we treat each word as a string and output the index in the dictionary for that word.
  • For example, let us say that the word “compression” has an index of 4978 in one particular dictionary; it is the 4978th word in usr/share/dict/words. To compress a body of text, each time the string “compression” appears, it would be replaced by 4978.
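
A toy version of dictionary-based encoding follows; the word list here is a small illustrative dictionary, not a real system dictionary such as usr/share/dict/words:

```python
# Toy dictionary-based encoder: each word in the text is replaced by its
# index in a shared dictionary; the decoder reverses the lookup.
dictionary = ["a", "compression", "data", "is", "reduces", "technique"]
index = {word: i for i, word in enumerate(dictionary)}

def encode(text: str):
    return [index[w] for w in text.split()]

def decode(codes):
    return " ".join(dictionary[c] for c in codes)

codes = encode("compression is a technique")
print(codes)                                   # [1, 3, 0, 5]
assert decode(codes) == "compression is a technique"
```

Real substitution coders such as Lempel-Ziv go further by building the dictionary adaptively from the input itself, so no dictionary has to be shipped alongside the compressed data.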

Lossy Compression

A lossy compression method is one where compressing data and then decompressing it retrieves data that may well be different from the original, but is “close enough” to be useful in some way. The algorithm eliminates irrelevant information as well and permits only an approximate reconstruction of the original file. Lossy compression is also done by re-writing the data in a more space-efficient way, but more than that: less important details of the image are manipulated or even removed so that higher compression rates are achieved. Lossy compression is dangerously attractive because it can provide compression ratios of 100:1 to 200:1, depending on the type of information being compressed. But the cost is loss of data.

The advantage of lossy methods over lossless methods is that in some cases a lossy method can produce a much smaller compressed file than any known lossless method, while still meeting the requirements of the application.

Examples of Lossy Methods are:

  • PCM
  • JPEG
  • MPEG



An Introduction of Vehicular Ad Hoc Networks (VANET)

Vehicular Ad Hoc Networks (VANETs) are a technology that enables connectivity for users on the move and helps implement Intelligent Transportation Systems (ITS). In a VANET, nodes cannot move freely around an area or surface; their movements are restricted to the roads. A VANET is a type of Mobile Ad Hoc Network (MANET), a group of mobile wireless nodes that cooperatively form an IP-based network. A node communicates directly with nodes within its wireless communication range. MANET nodes beyond each other’s wireless range communicate over multi-hop routes through intermediate nodes. These multi-hop routes change as the network topology changes over time. The best route is determined using a routing protocol such as DSDV, DSR, AODV, TORA, ZRP, etc.

Vehicular Ad Hoc Networks (VANETs) are used to provide communication among nearby vehicles and between vehicles and nearby fixed equipment, usually described as Road Side Units (RSUs). VANET technologies aim at enhancing traffic safety for drivers, providing comfort, and reducing transportation time and fuel consumption. A VANET uses moving cars as nodes to create a network: it turns every car into a wireless router or node, allowing cars approximately 100 to 300 meters from each other to connect and, in turn, create a network with a wide range. One of the greatest challenges of VANETs is to establish cost-effective connections between vehicles or between vehicles and RSUs.

Vehicular Ad Hoc Networks (VANETs) are an emerging technology for intelligent inter-vehicle communication and seamless internet connectivity, resulting in improved road safety, essential alerts, and access to comfort and entertainment. The technology integrates WLAN/cellular and ad hoc networks to achieve continuous connectivity. Broadcasting in VANETs is emerging as a critical area of research. One challenge is that the routing problem is often confined to vehicle-to-vehicle (V2V) scenarios rather than also utilizing the wireless infrastructure. At a fundamental level, safety and transport efficiency are a mandate for current car manufacturers, and these have to be provided by the cars on the road rather than by relying solely on the existing wireless communications infrastructure.

VANETs offer substantial advantages to companies of any size. With high-speed in-vehicle internet access, the automobile’s onboard system changes from a convenient widget into genuine production equipment, making nearly any internet technology accessible in the car. The network does raise specific safety concerns; for example, no one can safely type an email while driving. This, however, does not limit VANET’s potential as production equipment: it turns time that used to be wasted waiting (“dead time”) into time used to accomplish tasks (“live time”).

If travelers can download their email, a traffic jam becomes productive time: they can read messages on the on-board system while traffic is stuck, or browse the internet while waiting in the car for a relative or friend. An integrated GPS system can provide traffic reports that help find the fastest route to work. Finally, VANETs would permit free calls among workers via services such as Skype or Google Talk, reducing telecommunication charges.

The main goal of Vehicular Ad Hoc Networks (VANET) is to provide safety and comfort for passengers. To this end, special electronic devices will be placed inside each vehicle which will provide an Ad-hoc network and server communication. Each vehicle equipped with a VANET device will be a node in the ad-hoc network and can receive and relay other messages through the wireless network. There are also multimedia and internet connectivity facilities for passengers, all provided within the wireless coverage for each car.

Characteristics of VANET

  • Rapid topology changes and frequent fragmentation, resulting in a small effective network diameter
  • Virtually no power constraints
  • Variable, highly dynamic scale and network density
  • Drivers may adjust their behavior in reaction to data received from the network, causing further topology changes

High dynamic topology: Speed and choice of path define the dynamic topology of a VANET. Assume two vehicles are moving away from each other, each at about 60 mph (roughly 25 m/s), and the transmission range is about 250 m. The relative speed is then 50 m/s, so the link between the two vehicles lasts only 250 m ÷ 50 m/s = 5 seconds. This illustrates the highly dynamic topology.
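The link-lifetime arithmetic above is simple enough to check directly. The range and speed values are the assumed figures from the text, not measured data.

```python
# Back-of-the-envelope link lifetime: how long two separating vehicles stay
# within radio range of each other.
def link_lifetime(tx_range_m: float, relative_speed_ms: float) -> float:
    return tx_range_m / relative_speed_ms

# Two vehicles separating at 25 m/s each -> 50 m/s relative speed.
print(link_lifetime(250, 25 + 25))  # -> 5.0 seconds
```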

Frequently disconnected network: The feature above implies that roughly every 5 seconds, a node needs to establish a new link with a nearby vehicle to maintain seamless connectivity. When this fails, particularly in low vehicle-density zones, frequent disruption of network connectivity occurs. Such problems are sometimes addressed by roadside deployment of relay nodes.

Mobility modeling and prediction: Maintaining connectivity therefore requires knowledge of node positions and movements, which are very difficult to predict given the nature and pattern of each vehicle’s movement. Nonetheless, a mobility model and node prediction based on predefined roadway models and vehicle speeds are of paramount importance for effective network design.

Communication environment: The mobility model varies greatly between highway and city environments, so node prediction and routing algorithms need to adapt to these changes. The highway mobility model, which is essentially one-dimensional, is relatively simple and easy to predict. In the city mobility model, however, street structure, variable node density, and the presence of buildings and trees that act as obstacles even to short-distance communication make the model very complex and difficult to apply.

Features of VANET

  • The nodes in a Vehicular Ad Hoc Networks (VANET) are vehicles and roadside units
  • The movement of these nodes is very fast
  • The motion patterns are restricted by road topology
  • Each vehicle acts as a transceiver, i.e. sends and receives at the same time, creating a highly dynamic, continuously changing network.
  • The vehicular density varies over time; for instance, it might increase during peak office hours and decrease at night.

Applications of VANET

Three major classes of applications possible in VANET are

  • Safety oriented
  • Convenience oriented
  • Commercial oriented

Routing protocols in VANET

  • Ad-hoc routing
  • Position-based routing
  • Cluster routing
  • Broadcast-based routing
  • Geocast based routing

Ad-hoc routing: AODV (Ad Hoc On-Demand Distance Vector) and DSR (Dynamic Source Routing) can be applied to VANETs. However, simulations of these algorithms in VANETs show frequent communication breaks due to the highly dynamic nature of the nodes. To meet VANET challenges, the existing algorithms are suitably modified. One model applies the following:

  • A highly partitioned highway scenario is used where most path segments are relatively small.
  • The initial simulation with the AODV algorithm resulted in frequent link breaks as expected, owing to the dynamic nature of the node’s mobility.
  • Two predictions are added to AODV to upgrade the algorithm.
  • In the first, node positions and speeds are fed into AODV to predict link lifetime. This variant, PR-AODV, constructs a new alternate link before the estimated link lifetime expires (in plain AODV, a new link is created only after connectivity fails).
  • The second modified algorithm (PR-AODV-M) selects the route with the maximum predicted lifetime among the available options (in contrast to selecting the shortest path as in PR-AODV or AODV).
  • Simulations of both showed an improved packet delivery ratio.
  • However, the success of these algorithms largely depends on the accuracy of node position and mobility information.

In another model, AODV is modified to forward the route request within a zone (rectangular or circular) of relevance (ZOR) from the point of event occurrence to make the algorithm more effective.

Position-based routing: This technique uses vehicles’ awareness of other vehicles’ positions to develop the routing strategy. One of the best-known position-based routing protocols is GPSR (Greedy Perimeter Stateless Routing), which combines greedy forwarding and face routing. The algorithm has the following advantages and constraints:

  • It works best in open-space scenarios (highways) with evenly distributed nodes; the near absence of obstacles in highway scenarios accounts for its good performance.
  • In highway scenarios, simulation results generally show GPSR performing better than DSR.
  • In city conditions, GPSR suffers from many problems:
  1. Greedy forwarding is restricted owing to obstacles
  2. Routing performance degrades because of the longer path resulting in higher delays
  3. Node mobility can induce routing loops for face routing
  4. The packet can at times be forwarded in the wrong direction resulting in higher delays
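Greedy forwarding, and the local-maximum failure that forces GPSR to fall back to face routing, can be sketched as below. The coordinates are illustrative assumptions; real GPSR operates on beaconed neighbor positions.

```python
import math

# GPSR-style greedy forwarding sketch: forward to the neighbor geographically
# closest to the destination. If no neighbor is closer to the destination than
# the current node, greedy forwarding fails (a "local maximum", where GPSR
# switches to perimeter/face routing).
def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def greedy_next_hop(current, neighbors, dest):
    best = min(neighbors, key=lambda n: dist(n, dest), default=None)
    if best is None or dist(best, dest) >= dist(current, dest):
        return None  # local maximum: fall back to face routing
    return best

hop = greedy_next_hop((0, 0), [(50, 10), (30, 40)], dest=(200, 0))
print(hop)  # -> (50, 10)
```

An obstacle (no neighbor closer to the destination) makes `greedy_next_hop` return `None`, matching the city-scenario problems listed above.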

Cluster-based routing: In cluster-based routing, several clusters of nodes are formed, each represented by a cluster head. Inter-cluster communication is carried out through cluster heads, whereas intra-cluster communication uses direct links. This clustering approach is, in general, more appropriate for MANETs; in VANETs, owing to high speeds and unpredictable variations in mobility, cluster links often break. Modified schemes such as COIN (Clustering for Open IVC Networks, put forth by Blum et al.) and LORA-CBF (Location-based Routing Algorithm using Cluster-Based Flooding, suggested by Santos et al.) incorporate a dynamic movement scheme, the expected decisions of a driver in a given scenario, and a higher tolerance for inter-vehicle distances, and are observed to provide a more stable structure at the cost of a little additional overhead.

Broadcast-based Routing: This is the most frequently used routing protocol in Vehicular Ad Hoc Networks (VANETs) especially to communicate safety-related messages. The simplest broadcast method is carried by flooding in which each node rebroadcasts the message to other nodes. This ascertains the arrival of messages to all targeted destinations but has a higher overhead cost. Moreover, it works well with a lesser number of nodes in the network. A larger density of nodes causes an exponential increase in message transmission leading to collisions, higher bandwidth consumption, and a drop in overall performance. Several selective forwarding schemes such as BROADCOMM (by Durresi et al.), UMB (Urban Multihop Broadcast Protocol), Vector-based Tracking Detection (V-TRADE), History Enhanced V-TRADE (HV-TRADE), etc are proposed to counter this network congestion.

  • BROADCOMM scheme: The highway is segmented into virtual cells that move along with the vehicles. Only a few selected nodes in each virtual cell (cell reflectors) are responsible for handling messages within the cell and forwarding/receiving messages to/from neighboring cell reflectors. The protocol works well with a smaller number of nodes and a simple highway structure.
  • UMB: In the UMB protocol, each node broadcasting a message assigns only the farthest node to rebroadcast it. At street intersections, repeaters are installed to forward the packet to all road segments. This scheme has a higher success ratio and can largely overcome interference, packet collisions, etc.
  • V-TRADE / HV-TRADE: This scheme is a GPS-based protocol. Based on position and movement information, each node classifies its neighboring nodes into different groups and while forwarding the message to neighboring nodes, it assigns only a few border nodes of each group to forward the packets. Because of the lesser number of nodes assigned for multi-hopping, it indicated significant bandwidth utilization.
  • Geocast-based Routing: It is a location-based multicast routing protocol. As the name implies, each node delivers the message/ packet to other nodes that lie within a specified geographic region predefined based on ZOR (zone of relevance). The philosophy is that the sender node need not deliver the packet to nodes beyond the ZOR, as the information (related to the accident, important alerts for example) would have the least importance to distant nodes. The scheme followed a directed flooding strategy within a defined ZOR so that it can limit the message overhead.
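The ZOR filtering step of geocast routing reduces to a simple membership test. This sketch assumes a circular zone and an arbitrary 500 m radius; real deployments would also support rectangular zones, as noted above.

```python
import math

# Geocast ZOR sketch: a node rebroadcasts a message only if it lies inside the
# zone of relevance (here a circle) around the event that triggered it.
def in_zor(node_pos, event_pos, radius_m=500.0):
    return math.hypot(node_pos[0] - event_pos[0],
                      node_pos[1] - event_pos[1]) <= radius_m

event = (0.0, 0.0)
print(in_zor((300.0, 400.0), event))  # -> True  (exactly 500 m away)
print(in_zor((600.0, 0.0), event))    # -> False (beyond the zone)
```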



Key management mechanisms for secure VANET operation turn out to be a surprisingly intricate and challenging endeavor, because of multiple seemingly conflicting requirements. On one hand, vehicles need to authenticate the vehicles they communicate with, and road authorities would like to trace drivers who abuse the system. On the other hand, VANETs need to protect a driver’s privacy; in particular, drivers may not wish to be tracked wherever they travel.

A VANET key management mechanism should provide the following desirable properties:

Authenticity: A vehicle needs to authenticate other legitimate vehicles, and messages sent out by other legitimate vehicles. A vehicle should filter out bogus messages injected by a malicious outsider and accept only messages from legitimate participants.

Privacy: RSUs and casual observers should not be able to track down a driver’s trajectory in the long term.  Authorities can already trace vehicles through cameras and automatic license-plate readers, however, Vehicular Ad Hoc Networks (VANETs) should not make such tracing any simpler. The privacy requirements are seemingly contradictory to the authenticity requirement: suppose each vehicle presents a certificate to vouch for its validity, then different uses of the same certificate can be linked to each other. In particular, suppose a vehicle presents the certificate to an RSU in one location; and later presents the same certificate to another RSU in a different location. Then if these two RSUs compare the information that they have collected, they can easily learn that the owner of the certificate has traveled from one location to another.
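The linkability problem described above is why many proposed VANET schemes rotate short-lived pseudonymous credentials. The following toy illustration uses random tokens in place of real certificates; it shows only the unlinkability idea, not actual certificate issuance or signing.

```python
import secrets

# Pseudonym-rotation sketch: if a vehicle presents a fresh identifier to each
# RSU, two RSUs comparing notes cannot trivially link the sightings.
class Vehicle:
    def new_pseudonym(self) -> str:
        return secrets.token_hex(8)  # fresh, unpredictable identifier

v = Vehicle()
seen_by_rsu1 = v.new_pseudonym()  # presented at one location
seen_by_rsu2 = v.new_pseudonym()  # presented at another location
print(seen_by_rsu1 != seen_by_rsu2)  # -> True (with overwhelming probability)
```

A real scheme must also preserve traceability: an authority holding extra key material can still link the pseudonyms back to the vehicle, which this sketch deliberately omits.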

Traceability and Revocation: An authority should be able to trace a vehicle that abuses the Vehicular Ad Hoc Networks (VANETs). In addition, once a misbehaving vehicle has been traced, the authority should be able to revoke it in a timely manner. This prevents any further damage that the misbehaving vehicle might cause to the VANET.

Efficiency: To make VANETs economically viable, the OBUs (on-board units) have resource-limited processors. Therefore, the cryptography used in a VANET should not incur heavy computational overhead.



5G Technology – Working, Advantages

At the end of 2018, the industry association 3GPP defined any system that uses 5G NR (5G New Radio) software as “5G”, the fifth generation of cellular network technology. 5G brings three new aspects: higher speed, lower latency, and the ability to connect many more devices, including sensors and IoT devices. Early 5G systems are non-stand-alone networks because they still need active 4G support for the initial connection; several more years of development are needed for stand-alone operation.

Figure : Evolution of Technologies

The 5th generation mobile network offers key technological features beyond what legacy 4G currently provides (5GPPP 2015).

  1. Very low latency: less than 1 ms.
  2. Higher data speeds: up to 10 Gbps.
  3. Significantly higher wireless capacity (mmWave spectrum), allowing massive-device connectivity.
  4. Reduced energy consumption.
  5. Unconventional resource virtualization.
  6. On-demand service-oriented resource allocation.
  7. Automated management and orchestration.
  8. Multi-tenancy.

5G runs on the same radio frequencies that are currently being used for your smartphone, on Wi-Fi networks, and in satellite communications, but it enables technology to go a lot further. Beyond being able to download a full-length HD movie to your phone in seconds (even from a crowded stadium), 5G is really about connecting things everywhere – reliably, without lag – so people can measure, understand and manage things in real-time.

5G technology has a theoretical peak speed of 20 Gbps, while the peak speed of 4G is only 1 Gbps. 5G also promises lower latency, which can improve the performance of business applications as well as other digital experiences (such as online gaming, videoconferencing, and self-driving cars). While earlier generations of cellular technology (such as 4G LTE) focused on ensuring connectivity, 5G takes connectivity to the next level by delivering connected experiences from the cloud to clients. 5G networks are virtualized and software-driven, and they exploit cloud technologies.

The 5G network will also simplify mobility, with seamless open roaming capabilities between cellular and Wi-Fi access. Mobile users can stay connected as they move between outdoor wireless connections and wireless networks inside buildings without user intervention or the need for users to reauthenticate.

The new Wi-Fi 6 wireless standard (also known as 802.11ax) shares traits with 5G, including improved performance. Wi-Fi 6 radios can be placed where users need them to provide better geographical coverage and lower cost. Underlying these Wi-Fi 6 radios is a software-based network with advanced automation.
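The "HD movie in seconds" claim follows directly from the peak rates quoted above. The 5 GB movie size is an assumption for illustration; the rates are the theoretical peaks, which real networks rarely reach.

```python
# Rough download-time arithmetic at theoretical peak rates.
def download_seconds(size_gigabytes: float, rate_gbps: float) -> float:
    return size_gigabytes * 8 / rate_gbps  # convert bytes to bits

print(download_seconds(5, 1))   # 4G peak (1 Gbps)  -> 40.0 seconds
print(download_seconds(5, 20))  # 5G peak (20 Gbps) -> 2.0 seconds
```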

How 5G Works   

5G technology will introduce advances throughout the network architecture. 5G New Radio, the global standard for a more capable 5G wireless air interface, will cover spectrum not used in 4G. New antennas will incorporate a technology known as massive MIMO (multiple input, multiple output), which enables multiple transmitters and receivers to transfer more data at the same time. But 5G technology is not limited to the new radio spectrum: it is designed to support a converged, heterogeneous network combining licensed and unlicensed wireless technologies, which will increase the bandwidth available to users.

5G architectures will be software-defined platforms, in which networking functionality is managed through software rather than hardware. Advancements in virtualization, cloud-based technologies, and IT and business process automation enable 5G architecture to be agile and flexible and to provide anytime, anywhere user access. 5G networks can create software-defined subnetwork constructs known as network slices. These slices enable network administrators to dictate network functionality based on users and devices.

5G also enhances digital experiences through machine-learning (ML)-enabled automation. Demand for response times within fractions of a second (such as those for self-driving cars) requires 5G networks to enlist automation with ML and, eventually, deep learning and artificial intelligence (AI). Automated provisioning and proactive management of traffic and services will reduce infrastructure costs and enhance the connected experience.  

Applications of 5G Technology

Autonomous Vehicles

Autonomous vehicles are one of the most anticipated 5G applications. Vehicle technology is advancing rapidly to support the autonomous vehicle future. 5G networks will be an enormous enabler for autonomous vehicles, due to the dramatically reduced latency, as vehicles will be able to respond 10-100 times faster than over current cellular networks. The ultimate goal is a vehicle-to-everything (V2X) communication network, which will enable vehicles to respond automatically to objects and changes around them almost instantaneously. A vehicle must be able to send and receive messages in milliseconds to brake or change direction in response to road signs, hazards, and people crossing the street.

5G IoT in Smart City Infrastructure and Traffic Management 

Many cities around the world today are deploying intelligent transportation systems (ITS), and are planning to support connected vehicle technology. Aspects of these systems are relatively easy to install using current communications systems that support smart traffic management to handle vehicle congestion and route emergency vehicles. Connected vehicle technology will enable bidirectional communications from vehicle to vehicle (V2V) and from vehicle to infrastructure (V2I) to promote safety across transportation systems. Smart cities are now installing sensors in intersections to detect movement and cause connected and autonomous vehicles to react as needed.

5G IoT Applications in Industrial Automation 

The key benefits of 5G in the industrial automation space are wireless flexibility, reduced costs, and the viability of applications that are not possible with current wireless technology. With 5G, industrial automation applications can cut the cord and go fully wireless, enabling more efficient smart factories. 

Augmented Reality (AR) and Virtual Reality (VR) 

The low latency of 5G will make AR and VR applications both immersive and far more interactive. In industrial applications, for example, a technician wearing 5G AR goggles could see an overlay of a machine that would identify parts, provide repair instructions, or show parts that are not safe to touch. The opportunities for highly responsive industrial applications that support complex tasks will be extensive. In business environments, you can have AR meetings where it appears two people are sitting together in the same room, turning boring phone or 2D video conferences into more interactive 3D gatherings. Sporting events and experiences will likely be some of the top applications for 5G in the consumer space. Anytime you need to react quickly to a stimulus, such as in a sports training application, it must happen with minimal latency. 

5G IoT Applications for Drones 

Drones have a vast and growing set of use cases today beyond the consumer use for filming and photography. With 5G, however, you will be able to put on goggles to “see” beyond current limits with low latency and high-resolution video. 5G will also extend the reach of controllers beyond a few kilometers or miles. These advances will have implications for use cases in search and rescue, border security, surveillance, drone delivery services, and more.   

Advantages of 5G Technology

  1.  Increased speed and bandwidth. 
  2.  Greater device density aids mobile e-commerce. 
  3.  Improved WAN connections. 
  4.  Better battery life for remote IoT devices. 
  5.  Enhanced security with hardened endpoints. 



Resource Provisioning: A Significant View

The cloud computing paradigm offers users rapid on-demand access to computing resources such as CPU, RAM, and storage, with minimal management overhead. Commercial cloud platforms organize a shared resource pool for serving their users. Virtualization technologies help cloud providers pack their resources into different types of virtual machines (VMs) for allocation to cloud users. Under static provisioning, the cloud assembles its available resources into different types of VMs based on simple heuristics or historical VM demand patterns, before the auction starts. Under dynamic provisioning, the cloud assembles VMs in an online fashion upon receiving VM bundle bids, targeting the maximum possible social welfare given the current bid profile [15].

The system model for a cloud environment comprises the cloud producer, a virtual machine repository, cloud brokers, and cloud consumers, as shown in Figure 2.2.

Figure 2.2 Cloud Environment System Model

Services are offered to users on a rental basis to run their applications, with instances paid for by the time used. Since these services are publicly available, they are often called public clouds. Parallel to these are private clouds, which are managed for a single organization and dedicated to its internal or single-consumer services. Hybrid clouds, combinations of private and public clouds, are also available to consumers. Many cloud services also share infrastructure among several organizations from a specific community with common concerns, for example security, compliance, and jurisdiction; such a cloud can be managed internally or by a third party, and hosted internally or externally [16]. All these models are shown in Figure 2.3.

Figure 2.3 Cloud Services and Development Models

Resource provisioning varies from consumer to consumer as requirements vary. In general, a consumer makes a request to the producer for a specific resource. On receiving the consumer's request, the producer searches its list of available resources. If the resource is available, the producer allocates it based on the priority of the request for that particular resource. If the resource is not available, the consumer has to send the request to another resource provider. In such cases the producer encounters a matchmaking problem: for every request made by different consumers, the producer has to initiate the search mechanism. The consumer, on the other end, sends the request to multiple producers and opts for the fastest available resource [17].
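The matchmaking step described above can be sketched as a priority-ordered allocation loop. All names, the priority model, and the resource labels here are illustrative assumptions, not an actual provider API.

```python
# Provisioning matchmaking sketch: the producer keeps a set of free resources
# and serves the highest-priority pending request for each resource first.
def provision(requests, available):
    """requests: list of (priority, consumer, resource); lower = higher priority."""
    free = set(available)
    allocations = {}
    for priority, consumer, resource in sorted(requests):
        if resource in free:          # search the producer's resource list
            free.remove(resource)
            allocations[consumer] = resource
        # else: the consumer must retry with another producer
    return allocations

reqs = [(2, "c2", "vm-small"), (1, "c1", "vm-small"), (3, "c3", "vm-large")]
print(provision(reqs, {"vm-small", "vm-large"}))
# -> {'c1': 'vm-small', 'c3': 'vm-large'}
```

Consumer c2 loses the tie for `vm-small` to the higher-priority c1 and would, per the text, have to query another producer.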

Workload in Cloud Computing

Here we use the term workload to refer to the utilization of the IT resources on which an application is hosted. Workload is the consequence of users accessing the application or of jobs that need to be handled automatically. Workload takes different forms depending on the type of IT resource for which it is measured: servers may experience processing load; storage offerings may be assigned larger or smaller amounts of data to store or may have to handle queries on that data; and communication IT resources, such as networking hardware or messaging systems, may experience varying data or message traffic. In the scope of the abstract workload patterns, we merely assume this utilization to be measurable in some form [14].

Static workload: IT resources with roughly equal utilization over time experience static workload. Static workloads are characterized by a more-or-less flat utilization profile within certain boundaries, so there is normally no explicit need to add or remove processing power, memory, or bandwidth in response to workload changes. When provisioning for such a workload, the necessary IT resources can be provisioned for the static load plus a certain overprovisioning rate to deal with the minimal variance in the workload. The cost overhead for this minimal overprovisioning is relatively low.

Periodic workload: IT resources with utilization that peaks at recurring time intervals experience periodic workload. Periodic tasks and routines are very common in everyday life: monthly paychecks, monthly telephone bills, yearly car checkups, weekly status reports, and the daily use of public transport during rush hour all occur in well-defined intervals, and many people perform them at the same intervals. As many of the business processes supporting these tasks and routines are backed by IT systems today, a lot of periodic utilization occurs on those supporting IT systems.

Once-in-a-lifetime workload: IT resources with equal utilization over time, disturbed by a strong peak that occurs only once, experience once-in-a-lifetime workload. As a special case of periodic workload, the peak may occur only once in a very long timeframe, and it is often known in advance because it correlates with a certain event or task. Even though the challenge of acquiring the needed resources does not arise frequently, it can be even more severe: the discrepancy between the regularly required number of IT resources and those required during the rare peak is commonly greater than for periodic workloads. This discrepancy makes long-term investment in IT resources to handle the one-time peak very inefficient, yet without additional IT resources the peak demand often cannot be handled at all.

Unpredictable workload: IT resources with random, unforeseeable utilization over time experience unpredictable workload. Random workloads are a generalization of periodic workloads: they require elasticity but are not predictable. Such workloads occur quite often in the real world, for example sudden increases in website accesses due to weather phenomena, or shopping sprees when new products gain unforeseen attention and public interest. Under these conditions, the occurrence of peaks, or at least their height and duration, often cannot be foreseen in advance [14].
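The four workload patterns above can be illustrated with toy generators. The shapes, units, and parameters are arbitrary assumptions chosen only to show the characteristic profiles, not real traces.

```python
import math
import random

# Illustrative generators for the four workload patterns (arbitrary units).
def static_load(t, base=100, jitter=5):
    # Flat profile with minimal variance around the base.
    return base + random.uniform(-jitter, jitter)

def periodic_load(t, base=100, amplitude=50, period=24):
    # Utilization peaking at recurring intervals (e.g. a daily cycle).
    return base + amplitude * math.sin(2 * math.pi * t / period)

def once_in_a_lifetime_load(t, base=100, peak=1000, peak_at=500, width=10):
    # Flat profile disturbed by a single strong, pre-known peak.
    return base + (peak if abs(t - peak_at) <= width else 0)

def unpredictable_load(t, base=100):
    # Random, unforeseeable utilization requiring elasticity.
    return base * random.lognormvariate(0, 0.5)

for t in range(0, 24, 6):
    print(round(periodic_load(t)))  # -> 100, 150, 100, 50
```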

Open Source Resources in Cloud Computing

Eucalyptus: Released as an open-source (under a FreeBSD-style license) infrastructure for cloud computing on clusters that duplicates the functionality of Amazon’s EC2, Eucalyptus directly uses the Amazon command-line tools. Startup Eucalyptus Systems was launched this year with venture funding, and the staff includes original architects from the Eucalyptus project. The company recently released its first major update to the software framework, which is also powering the cloud computing features in the new version of Ubuntu Linux [13].

Red Hat’s Cloud: Linux-focused open-source player Red Hat has been rapidly expanding its focus on cloud computing. At the end of July, Red Hat held its Open Source Cloud Computing Forum, which included a large number of presentations from movers and shakers focused on open-source cloud initiatives; free webcasts are available for all the presentations, and Stevens’ webcast can bring you up to speed on Red Hat’s cloud strategy. Novell is another open source-focused company that is increasingly focused on cloud computing [13].

Cloudera: The open-source Hadoop software framework is increasingly used in cloud computing deployments due to its flexibility with cluster-based, data-intensive queries and other tasks. It’s overseen by the Apache Software Foundation, and Yahoo has its own time-tested Hadoop distribution. Cloudera is a promising startup focused on providing commercial support for Hadoop [13].

Traffic Server: Yahoo this week moved its open-source cloud computing initiatives up a notch with the donation of its Traffic Server product to the Apache Software Foundation. Traffic Server is used in-house at Yahoo to manage its own traffic, and it enables session management, authentication, configuration management, load balancing, and routing for entire cloud computing software stacks. Acting as an overlay to raw cloud computing services, Traffic Server allows IT administrators to allocate resources, including handling thousands of virtualized services concurrently [13].
