What is Deep Learning: A Basic Concept

Machine-learning technology powers many aspects of modern society: from web searches to content filtering on social networks to recommendations on e-commerce websites, and it is increasingly present in consumer products such as cameras and smartphones. Machine-learning systems are used to identify objects in images, transcribe speech into text, match news items, posts, or products with users’ interests, and select relevant results of a search. Increasingly, these applications make use of a class of techniques called deep learning. Deep learning is a machine learning technique that teaches computers to do what comes naturally to humans: learn by example.


Deep learning is a subfield of machine learning concerned with algorithms, called artificial neural networks, that are inspired by the structure and function of the brain. In deep learning, a computer model learns to perform classification tasks directly from images, text, or sound. Deep learning models can achieve state-of-the-art accuracy, sometimes exceeding human-level performance. Models are trained using a large set of labeled data and neural network architectures that contain many layers. Usually, when people use the term deep learning, they are referring to deep artificial neural networks, and somewhat less frequently to deep reinforcement learning.

“Deep” is a technical term: it refers to the number of layers in a neural network. A shallow network has one so-called hidden layer, and a deep network has more than one. Multiple hidden layers allow deep neural networks to learn features of the data in a so-called feature hierarchy, because simple features (e.g., two pixels) recombine from one layer to the next to form more complex features (e.g., a line). Nets with many layers pass input data (features) through more mathematical operations than nets with few layers and are therefore more computationally intensive to train. Computational intensity is one of the hallmarks of deep learning, and it is one reason why GPUs are in demand for training deep-learning models.
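To make the layer terminology concrete, here is a minimal plain-Python sketch of input passing through a shallow net (one hidden layer) versus a deep one (three hidden layers). The weights are random and untrained, purely for illustration; a real network would learn them from data.

```python
import random

def dense(x, weights, biases):
    """One fully connected layer: each output is a weighted sum of all inputs."""
    return [sum(xi * wi for xi, wi in zip(x, w)) + b
            for w, b in zip(weights, biases)]

def relu(v):
    """Elementwise nonlinearity; this is what lets stacked layers build a feature hierarchy."""
    return [max(0.0, a) for a in v]

def forward(x, layers):
    """Pass the input through every layer, with ReLU between hidden layers."""
    for i, (w, b) in enumerate(layers):
        x = dense(x, w, b)
        if i < len(layers) - 1:          # no activation after the output layer
            x = relu(x)
    return x

def rand_layer(n_in, n_out):
    """Random, untrained weights: purely for illustration."""
    return ([[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)

random.seed(0)
shallow = [rand_layer(4, 8), rand_layer(8, 2)]                      # 1 hidden layer
deep = [rand_layer(4, 8), rand_layer(8, 8), rand_layer(8, 8),
        rand_layer(8, 2)]                                           # 3 hidden layers

x = [0.5, -1.0, 0.25, 2.0]
print(forward(x, shallow))   # output after passing through fewer operations
print(forward(x, deep))      # same-sized output, but after many more operations
```

The deep net applies strictly more multiply-add operations per input, which is exactly the computational cost the paragraph above describes.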

Deep learning is a key technology behind driverless cars, enabling them to recognize a stop sign or to distinguish a pedestrian from a lamppost. It is the key to voice control in consumer devices like phones, tablets, TVs, and hands-free speakers. Deep learning is getting lots of attention lately and for good reason. It’s achieving results that were not possible before.

Examples of Deep Learning at Work

Deep learning applications are used in industries from automated driving to medical devices.

Automated Driving: Automotive researchers are using deep learning to automatically detect objects such as stop signs and traffic lights. In addition, deep learning is used to detect pedestrians, which helps decrease accidents.

Aerospace and Defense: Deep learning is used to identify objects from satellites that locate areas of interest, and identify safe or unsafe zones for troops.

Medical Research: Cancer researchers are using deep learning to automatically detect cancer cells. Teams at UCLA built an advanced microscope that yields a high-dimensional data set used to train a deep learning application to accurately identify cancer cells.

Industrial Automation: Deep learning is helping to improve worker safety around heavy machinery by automatically detecting when people or objects are within an unsafe distance of machines.

Electronics: Deep learning is being used in automated hearing and speech translation. For example, home assistance devices that respond to your voice and know your preferences are powered by deep learning applications.

How Deep Learning Works

Most deep learning methods use neural network architectures, which is why deep learning models are often referred to as deep neural networks.

The term “deep” usually refers to the number of hidden layers in the neural network. Traditional neural networks only contain 2-3 hidden layers, while deep networks can have as many as 150. Deep learning models are trained by using large sets of labeled data and neural network architectures that learn features directly from the data without the need for manual feature extraction.

Figure 1: Neural networks, which are organized in layers consisting of a set of interconnected nodes

One of the most popular types of deep neural networks is known as convolutional neural networks (CNN or ConvNet). A CNN convolves learned features with input data and uses 2D convolutional layers, making this architecture well-suited to processing 2D data, such as images.
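As a rough illustration of what “convolves learned features with input data” means, the following plain-Python sketch slides a small 2D kernel over an image. In a real CNN the kernel values are learned during training; the hand-picked vertical-edge kernel here is only an assumption for demonstration.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and take a weighted sum at each
    position (stride 1, no padding). Like most deep learning frameworks,
    this is technically cross-correlation: the kernel is not flipped."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# An image with a sharp left/right boundary, and a vertical-edge kernel.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
edge = [[-1, 1],
        [-1, 1]]
print(convolve2d(img, edge))   # responds only where the boundary lies
```

The output is large only at the column where the intensity jumps, which is the sense in which a convolutional layer detects a local feature everywhere in a 2D input.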

Difference between Machine Learning and Deep Learning

Deep learning is a specialized form of machine learning. A machine learning workflow starts with relevant features being manually extracted from images. The features are then used to create a model that categorizes the objects in the image. With a deep learning workflow, relevant features are automatically extracted from images. In addition, deep learning performs “end-to-end learning” – where a network is given raw data and a task to perform, such as classification, and it learns how to do this automatically.
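The manual-feature workflow described above can be sketched as follows. The two features (mean brightness and a crude count of intensity jumps) and the threshold rule are hypothetical choices made up for illustration, not a standard pipeline; the point is only that a human, not the model, decided what to measure.

```python
def extract_features(image):
    """Hand-crafted features, as in a classical machine learning workflow:
    mean brightness plus a crude count of horizontal intensity jumps."""
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    edges = sum(1 for row in image
                for a, b in zip(row, row[1:]) if abs(a - b) > 0.5)
    return mean, edges

def classify(features):
    """A hand-tuned rule on the extracted features (here only the edge count)."""
    _, edges = features
    return "textured" if edges > 2 else "flat"

flat_img = [[0.1, 0.1, 0.1, 0.1],
            [0.1, 0.1, 0.1, 0.1]]
stripe_img = [[0.0, 1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 1.0]]
print(classify(extract_features(flat_img)))     # flat
print(classify(extract_features(stripe_img)))   # textured
```

In the deep learning workflow, both `extract_features` and `classify` would be replaced by one trained network that learns its own internal features from labeled examples.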

Another key difference is deep learning algorithms scale with data, whereas shallow learning converges. Shallow learning refers to machine learning methods that plateau at a certain level of performance when you add more examples and training data to the network.

A key advantage of deep learning networks is that they often continue to improve as the size of your data increases.

Figure 2: Comparing a machine learning approach to categorizing vehicles (left) with deep learning (right)

In machine learning, you manually choose features and a classifier to sort images. With deep learning, the feature extraction and modeling steps are automatic.


[1] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton, “Deep learning”, Nature 521, Number 7553 (2015): pp. 436-444.

[2] “What Is Deep Learning? 3 things you need to know”, available online at: https://in.mathworks.com/discovery/deep-learning.html

[3] “Artificial Intelligence, Machine Learning, and Deep Learning”, available online at: https://deeplearning4j.org/ai-machinelearning-deeplearning

[4] Navdeep, “Ingestion and Processing of Data for Big Data and IoT Solutions”, March 03, 2017, available online at: https://www.xenonstack.com/blog/ingestion-processing-data-for-big-data-iot-solutions

What is a Multi-Agent System

Multi-agent systems are made up of multiple interacting intelligent agents—computational entities to some degree autonomous and able to cooperate, compete, communicate, act flexibly, and exercise control over their behavior within the frame of their objectives. They are the enabling technology for a wide range of advanced applications relying on distributed and parallel processing of data, information, and knowledge relevant in domains ranging from industrial manufacturing to e-commerce to health care.

What is it?

In artificial intelligence research, agent-based systems technology has been hailed as a new paradigm for conceptualizing, designing, and implementing software systems. Agents are sophisticated computer programs that act autonomously on behalf of their users, across open and distributed environments, to solve a growing number of complex problems. Increasingly, however, applications require multiple agents that can work together. A multi-agent system is a loosely coupled network of software agents that interact to solve problems that are beyond the individual capacities or knowledge of each problem solver. It can be defined as follows:

“A multi-agent system is a loosely coupled network of problem-solving entities (agents) that work together to find answers to problems that are beyond the individual capabilities or knowledge of each entity (agent)”.

The trend toward the development of increasingly intelligent systems is matched only by the trend toward the distribution of computing. The science of multi-agent systems lies at the intersection of these trends. Multi-agent systems are of great significance in a number of current and future applications of computer science. For example, they arise in systems for electronic data interchange, air traffic control, manufacturing automation, computer-supported cooperative work, and electronic banking, as well as in robotics and heterogeneous information systems.

An agent is a computerized entity such as a computer program or a robot. An agent can be described as autonomous because it has the capacity to adapt when its environment changes. A multi-agent system is made up of a set of computer processes that occur at the same time, i.e., several agents that exist at the same time, share common resources, and communicate with each other. The key issue in multi-agent systems is to formalize the coordination between agents. Research on agents, therefore, includes research into:

  • Decision-making: what decision-making mechanisms are available to the agent? What is the link between their perceptions, representations, and actions?
  • Control: what hierarchic relationships exist between agents? How are they synchronized?
  • Communication: what kind of messages do they send each other? What syntax do these messages obey?
Figure 1: Multi-Agent System Cooperation typology

Multi-agent systems can be applied to artificial intelligence. They simplify problem-solving by dividing the necessary knowledge into subunits, to each of which an independent intelligent agent is associated, and by coordinating the agents’ activity. In this sense, we speak of distributed artificial intelligence. This method can be used for monitoring an industrial process, for example, when the sensible solution, that of coordinating several specialized monitors rather than a single omniscient one, is adopted.
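The industrial-monitoring idea can be sketched in a few lines: several specialized monitor agents each watch one sensor, and a coordinator combines their messages. The class names, thresholds, and sensor readings below are illustrative assumptions, not a standard multi-agent framework.

```python
class MonitorAgent:
    """An agent specialized for one sensor; decides autonomously when to alert."""
    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold

    def observe(self, reading):
        """Return a message (sensor name, alarm flag) for the coordinator."""
        return self.name, reading > self.threshold

class Coordinator:
    """Coordinates the specialized agents rather than monitoring everything itself."""
    def __init__(self, agents):
        self.agents = agents

    def step(self, readings):
        """Collect one message per agent and report which sensors are alarming."""
        messages = [agent.observe(readings[agent.name]) for agent in self.agents]
        return [name for name, alarm in messages if alarm]

agents = [MonitorAgent("temperature", 90.0), MonitorAgent("pressure", 8.0)]
plant = Coordinator(agents)
print(plant.step({"temperature": 95.2, "pressure": 7.1}))   # ['temperature']
```

Each agent embodies one subunit of the monitoring knowledge, and the coordinator supplies the control and communication that the bullet points above identify as the core research questions.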

The fact that the agents within a multi-agent system work together implies that some form of cooperation among individual agents is involved. However, the concept of cooperation in a multi-agent system is at best unclear and at worst highly inconsistent, so the terminology, possible classifications, and so on are even more problematic than in the case of single agents, which makes any attempt to present multi-agent systems a hard problem. A typology of cooperation seems the simplest starting point, so we use this typology as the basis for multi-agent system classification. The typology is given in Figure 1.

Advantages of a Multi-Agent Approach

A multi-agent system has the following advantages over a single-agent or centralized approach:

  • A multi-agent system distributes computational resources and capabilities across a network of interconnected agents. Whereas a centralized system may be plagued by resource limitations, performance bottlenecks, or critical failures, a multi-agent system is decentralized and thus does not suffer from the “single point of failure” problem associated with centralized systems.
  • A multi-agent system allows for the interconnection and interoperation of multiple existing legacy systems. By building an agent wrapper around such systems, they can be incorporated into an agent society.
  • A multi-agent system models problems in terms of autonomous interacting component agents, which is proving to be a more natural way of representing task allocation, team planning, user preferences, open environments, and so on.
  • A multi-agent system efficiently retrieves, filters, and globally coordinates information from sources that are spatially distributed.
  • A multi-agent system provides solutions in situations where expertise is spatially and temporally distributed.
  • A multi-agent system enhances overall system performance, specifically along the dimensions of computational efficiency, reliability, extensibility, robustness, maintainability, responsiveness, flexibility, and reuse.


[1] Mevludin Glavic, “Agents and multi-agent systems: a short introduction for power engineers”, Technical Report, May 2006.

[2] “Multi-Agent Systems”, available online at: http://cormas.cirad.fr/en/demarch/sma.htm

[3] “Multi-Agent Systems”, available online at: https://www.cs.cmu.edu/~softagents/multi.html

What is Human-Computer Interaction (HCI)

Using computers has always raised the question of interfacing. The methods by which humans interact with computers have come a long way, and the journey continues: new designs of technologies and systems appear every day, and research in this area has grown very fast over the last few decades. The growth of the Human-Computer Interaction (HCI) field has not only been in the quality of interaction; it has also branched in different directions. Instead of designing regular interfaces, the various research branches have focused on multimodality rather than unimodality, on intelligent adaptive interfaces rather than command/action-based ones, and on active rather than passive interfaces.


Human-Computer Interaction (HCI) involves the planning and design of the interaction between users and computers, and these days ever smaller devices mediate that interaction. An important advantage of computer-vision-based interfaces is the freedom they offer: the user can interact with the computer without wires or intermediary devices. Recently, user interfaces have been built to capture the motion of our hands: researchers have developed techniques to track the movements of the hand and fingers through a webcam to establish an interaction mechanism between the user and the computer.

Sometimes called Man-Machine Interaction or Interfacing, the concept of Human-Computer Interaction/Interfacing (HCI) arose naturally with the emergence of the computer, or more generally the machine, itself. The reason is clear: the most sophisticated machines are worthless unless they can be used properly by people. This basic argument presents the two main terms that should be considered in the design of HCI: functionality and usability [1].

One important HCI factor is that different users form different conceptions or mental models about their interactions and have different ways of learning and keeping knowledge and skills (different “cognitive styles” as in, for example, “left-brained” and “right-brained” people). In addition, cultural and national differences play a part. Another consideration in studying or designing HCI is that user interface technology changes rapidly, offering new interaction possibilities to which previous research findings may not apply. Finally, user preferences change as they gradually master new interfaces.

Figure 1: Field of Human-Computer Interaction


Human-computer interaction (HCI) is the study of how people design, implement, and use interactive computer systems and how computers affect individuals, organizations, and society. This encompasses not only ease of use but also new interaction techniques for supporting user tasks, providing better access to information, and creating more powerful forms of communication. It involves input and output devices and the interaction techniques that use them; how information is presented and requested; how the computer’s actions are controlled and monitored; all forms of help, documentation, and training; the tools used to design, build, test, and evaluate user interfaces; and the processes that developers follow when creating Interfaces.

The goal of Human-Computer Interaction

The goals of HCI are to produce usable and safe systems, as well as functional systems. Usability is concerned with making systems easy to learn and easy to use. In order to produce computer systems with good usability, developers must attempt to:

  • Understand the factors that determine how people use technology
  • Develop tools and techniques to enable the building of suitable systems
  • Achieve efficient, effective, and safe interaction
  • Put the user first

Underlying the whole theme of HCI is the belief that people using a computer system should come first. Their needs, capabilities, and preferences for conducting various tasks should direct developers in the way that they design systems. People should not have to change themselves in order to fit the system; instead, the system should be designed to match their requirements.


[1] Fakhreddine Karray and Milad Alemzadeh, “Human-Computer Interaction: Overview on State of the Art”, International Journal on Smart Sensing and Intelligent Systems, Volume 1, Number 1, March 2008

[2] Kinjal N. Shah and Kirit R. Rathod, “A survey on Human-Computer Interaction Mechanism Using Finger Tracking”, International Journal of Computer Trends and Technology (IJCTT) – volume 7 number 3– Jan 2014

[3] “Chapter 1: Introduction”, available online at: http://shodhganga.inflibnet.ac.in/bitstream/10603/13990/6/06_chapter_1.pdf

What is Data Visualization and Applications

A picture is worth a thousand words – especially when we are trying to understand and discover insights from data. Visuals are especially helpful when we’re trying to find relationships among hundreds or thousands of variables to determine their relative importance – or if they are important at all. Regardless of how much data we have, one of the best ways to discern important relationships is through advanced analysis and high-performance data visualization. If sophisticated analyses can be performed quickly, even immediately, and results presented in ways that showcase patterns and allow querying and exploration, people across all levels in our organization can make faster, more effective decisions.


Data visualizations are surprisingly common in our everyday life, but they often appear in the form of well-known charts and graphs. A combination of multiple visualizations and bits of information is often referred to as infographics. Data visualizations can be used to discover unknown facts and trends. You may see visualizations in the form of line charts to display change over time. Bar and column charts are helpful when observing relationships and making comparisons. Pie charts are a great way to show parts of a whole. And maps are the best way to visually share geographical data.

“Data visualization is the presentation of quantitative information in a graphical form. In other words, data visualizations turn large and small datasets into visuals that are easier for the human brain to understand and process”.

Data visualization concerns the manipulation of sampled and computed data for comprehensive display. The goal of data visualization is to bring to the user a deeper understanding of the data as well as the underlying physical laws and properties. Such visualization may be used to enlighten a physicist on the complex interaction between electrons, to guide the medical practitioner in a surgery situation, or simply to view the surface of a planet which has never been seen by human eyes.

The important aspects of interactive visualization can be broken down into three categories:

Computation – the ability to speedily compute a visualization; this may include computing a polygonal approximation to an isosurface of a scalar function, the computation of a particle trace through a time-dependent vector field, or any action that requires extracting an abstract object or representation from the data being examined.

Display – the ability to quickly display the computed visualization; display encompasses both computed visualizations as listed above and direct display methods such as volume visualization and ray tracing.

Querying – the ability to interactively probe a displayed visualization for the purpose of further understanding, on a fine scale, what is being displayed on a coarser scale.

Importance of Data Visualization

Better Decision Making

Today more than ever, organizations are using data visualizations and data tools to ask better questions and make better decisions. Emerging computer technologies and new user-friendly software programs have made it easy to learn more about your company and make better data-driven business decisions. The strong emphasis on performance metrics, data dashboards, and Key Performance Indicators (KPIs) shows the importance of measuring and monitoring company data. Common quantitative information measured by businesses includes units of product sold, revenue by quarter, department expenses, employee statistics, and company market share.

  • Meaningful Storytelling: Data visualizations and information graphics (infographics) have become essential tools for today’s mainstream media. Data journalism is on the rise and journalists consistently rely on quality visualization tools to help them tell stories about the world around us. Many well-respected institutions have fully embraced data-driven news including The New York Times, The Guardian, The Washington Post, Scientific American, CNN, Bloomberg, The Huffington Post, and The Economist.
  • Data Literacy: Being able to understand and read data visualizations has become a necessary requirement for the 21st century. Because data visualization tools and resources have become readily available, more and more non-technical professionals are expected to be able to gather insights from data.

Data visualization, the use of images to represent information, is only now becoming properly appreciated for the benefits it can bring to business. It provides a powerful means both to make sense of data and to then communicate what we’ve discovered to others. Despite their potential, the benefits of data visualization are undermined today by a general lack of understanding. Many of the current trends in data visualization are actually producing the opposite of the intended effect, confusion rather than understanding. Nothing going on in the field of business intelligence today can bring us closer to fulfilling its promise of intelligence in the workplace than data visualization.

The Importance of Visualizations in Business

A visual can communicate more information than a table in a much smaller space. This trait of visuals makes them more effective than tables for presenting data. For example, notice the table below, and try to spot the month with the highest sales.

Month   Jan   Feb   Mar   Apr   May   Jun
Sales    45    56    36    58    75    62

This data when visualized gives you the same information in a second or two.

Figure 1: An example of data visualization
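As a rough sketch of the idea, a few lines of plain Python can already turn the sales table above into a text bar chart that makes the May peak jump out; the scaling to a fixed width is an arbitrary illustrative choice, and real charting tools do this far better.

```python
sales = {"Jan": 45, "Feb": 56, "Mar": 36, "Apr": 58, "May": 75, "Jun": 62}

def bar_chart(data, width=30):
    """Render a horizontal text bar chart; bars are scaled to the maximum value."""
    top = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / top * width)
        lines.append(f"{label} {bar} {value}")
    return "\n".join(lines)

print(bar_chart(sales))   # the longest bar (May) is visible at a glance
```

The eye finds the longest bar immediately, whereas the table forces a number-by-number comparison; that difference is the whole argument of this section.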

As Edward Tufte put it, ‘Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.’ This trait of visualizations is what makes them vital to businesses.


[1] Chandrajit Bajaj, “Data Visualization Techniques”, 1998 John Wiley & Sons Ltd

[2] “What is Data Visualization?” available online at: https://infogram.com/page/data-visualization

[3] “Principles of Data Visualization – What We See in a Visual”, White Paper, FusionCharts

[4] Stephen Few and Perceptual Edge, “Data Visualization Past, Present, and Future”, Innovation Center, Wednesday, January 10, 2007

What is Reinforcement Learning in Machine Learning

The idea that we learn by interacting with our environment is probably the first to occur to us when we think about the nature of learning. When an infant plays, waves its arms, or looks about, it has no explicit teacher, but it does have a direct sensorimotor connection to its environment. Reinforcement learning is a computational approach to understanding and automating goal-directed learning and decision-making. It is distinguished from other computational approaches by its emphasis on learning by an agent from direct interaction with its environment, without relying on exemplary supervision or complete models of the environment.

General Overview

Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward but also the next situation and, through that, all subsequent rewards. These two characteristics – trial-and-error search and delayed reward – are the two most important distinguishing features of reinforcement learning. Reinforcement learning is a type of machine learning, and thereby also a branch of artificial intelligence. It allows machines and software agents to automatically determine the ideal behavior within a specific context, so as to maximize performance. Simple reward feedback is required for the agent to learn its behavior; this is known as the reinforcement signal.


Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner’s predictions. Further, the predictions may have long-term effects by influencing the future state of the controlled system. Thus, time plays a special role. The goal of reinforcement learning is to develop efficient learning algorithms.

Figure 1: The Basic Reinforcement Learning Scenario

Reinforcement learning is defined not by characterizing learning methods, but by characterizing a learning problem. Any method that is well suited to solving that problem we consider to be a reinforcement learning method. Reinforcement learning is different from supervised learning, the kind of learning studied in most current research in machine learning, statistical pattern recognition, and artificial neural networks. Supervised learning is learning from examples provided by a knowledgeable external supervisor. This is an important kind of learning, but alone it is not adequate for learning from interaction. In interactive problems, it is often impractical to obtain examples of desired behavior that are both correct and representative of all the situations in which the agent has to act. In uncharted territory – where one would expect learning to be most beneficial – an agent must be able to learn from its own experience.

Elements of Reinforcement Learning

Beyond the agent and the environment, one can identify four main sub-elements of a reinforcement learning system: a policy, a reward signal, a value function, and, optionally, a model of the environment.

A policy defines the learning agent’s way of behaving at a given time. Roughly speaking, a policy is a mapping from perceived states of the environment to actions to be taken when in those states. It corresponds to what in psychology would be called a set of stimulus-response rules or associations. In some cases, the policy may be a simple function or lookup table, whereas in others it may involve extensive computation such as a search process. The policy is the core of a reinforcement learning agent in the sense that it alone is sufficient to determine behavior. In general, policies may be stochastic.

A reward signal defines the goal in a reinforcement learning problem. On each time step, the environment sends to the reinforcement learning agent a single number, a reward. The agent’s sole objective is to maximize the total reward it receives over the long run. The reward signal thus defines what the good and bad events are for the agent. In a biological system, we might think of rewards as analogous to the experiences of pleasure or pain. They are the immediate and defining features of the problem faced by the agent. As such, the process that generates the reward signal must be unalterable by the agent. The agent can alter the signal that the process produces directly by its actions and indirectly by changing its environment’s state— since the reward signal depends on these—but it cannot change the function that generates the signal.

Whereas the reward signal indicates what is good in an immediate sense, a value function specifies what is good in the long run. Roughly speaking, the value of a state is the total amount of reward an agent can expect to accumulate over the future, starting from that state. Whereas rewards determine the immediate, intrinsic desirability of environmental states, values indicate the long-term desirability of states after taking into account the states that are likely to follow, and the rewards available in those states. For example, a state might always yield a low immediate reward but still have a high value because it is regularly followed by other states that yield high rewards. Or the reverse could be true. To make a human analogy, rewards are somewhat like pleasure (if high) and pain (if low), whereas values correspond to a more refined and farsighted judgment of how pleased or displeased we are that our environment is in a particular state. Expressed this way, we hope it is clear that value functions formalize a basic and familiar idea.

The fourth and final element of some reinforcement learning systems is a model of the environment. This is something that mimics the behavior of the environment, or more generally, that allows inferences to be made about how the environment will behave. For example, given a state and action, the model might predict the resultant next state and next reward. Models are used for planning, by which we mean any way of deciding on a course of action by considering possible future situations before they are actually experienced.
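The interplay of policy, reward signal, and value estimates can be sketched with tabular Q-learning on a toy corridor world. Everything here (the corridor length, the learning rate, the discount, the exploration rate) is an illustrative assumption chosen for brevity, not a recommended configuration.

```python
import random

# A one-dimensional corridor: states 0..4; reward 1 only on reaching state 4.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)                       # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1    # learning rate, discount, exploration

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
random.seed(1)

def greedy(s):
    """Pick an action with the highest learned value, breaking ties at random."""
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

for _ in range(300):                               # episodes
    s = 0
    while s != GOAL:
        # Policy: mostly greedy, occasionally exploratory (epsilon-greedy).
        a = random.choice(ACTIONS) if random.random() < EPSILON else greedy(s)
        s2 = min(max(s + a, 0), N_STATES - 1)      # environment transition
        r = 1.0 if s2 == GOAL else 0.0             # reward signal
        # Value update: nudge Q(s, a) toward r + GAMMA * max_a' Q(s2, a').
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

policy = {s: greedy(s) for s in range(GOAL)}
print(policy)   # after training, the greedy policy steps right in every non-goal state
```

The reward is nonzero only at the goal, yet the value estimates propagate backward through the table, so states far from the goal acquire high values for the rightward action: a small demonstration of delayed reward shaping long-run behavior.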


One reason that reinforcement learning is popular is that it serves as a theoretical tool for studying the principles of agents learning to act. But it is unsurprising that it has also been used by a number of researchers as a practical computational tool for constructing autonomous systems that improve themselves with experience. These applications have ranged from robotics to industrial manufacturing, to combinatorial search problems such as computer game playing. Some of the practical applications of reinforcement learning are:

Manufacturing: At Fanuc, a robot uses deep reinforcement learning to pick a device from one box and put it in a container. Whether it succeeds or fails, it memorizes the object and gains knowledge, training itself to do this job with great speed and precision.

Inventory Management: A major issue in supply chain inventory management is the coordination of inventory policies adopted by different supply chain actors, such as suppliers, manufacturers, and distributors, so as to smooth material flow and minimize costs while responsively meeting customer demand.

Delivery Management: Reinforcement learning is used to solve the problem of Split Delivery Vehicle Routing. Q-learning is used to serve appropriate customers with just one vehicle.

Power Systems: Reinforcement learning and optimization techniques are utilized to assess the security of electric power systems and to enhance microgrid performance. Adaptive learning methods are employed to develop control and protection schemes. Transmission technologies with High-Voltage Direct Current (HVDC) and Flexible Alternating Current Transmission System (FACTS) devices based on adaptive learning techniques can effectively help to reduce transmission losses and CO2 emissions.

Finance Sector: AI is at the forefront of leveraging reinforcement learning for evaluating trading strategies. It is turning out to be a robust tool for training systems to optimize financial objectives. It has immense applications in stock market trading where the Q-Learning algorithm is able to learn an optimal trading strategy with one simple instruction.
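The Q-learning mentioned above can be sketched in a few lines. The toy environment below (a one-dimensional chain of states with a single rewarded goal state) and all parameter values are illustrative assumptions, not taken from any of the applications described:

```python
import random

random.seed(0)

# Toy environment: a 1-D chain of 6 states; the agent starts at state 0 and
# receives a reward of +1 only upon reaching the goal state 5 (terminal).
N_STATES, GOAL = 6, 5
ACTIONS = [-1, +1]                      # move left, move right

alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration
Q = [[0.0, 0.0] for _ in range(N_STATES)]

for episode in range(500):
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# After training, the greedy policy should walk straight toward the goal.
policy = [("left" if Q[s][0] > Q[s][1] else "right") for s in range(N_STATES - 1)]
print(policy)
```

The single instruction here is simply "maximize cumulative reward"; the same update rule scales to richer state and action spaces.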


[1] Szepesvári, Csaba, “Algorithms for reinforcement learning”, Synthesis lectures on artificial intelligence and machine learning 4, no. 1 (2010): 1-103.

[2] Sutton, Richard S., and Andrew G. Barto. Reinforcement learning: An introduction. Vol. 1, no. 1, Cambridge: MIT Press, 1998.

[3] Maruti Techlabs, “Reinforcement Learning and Its Practical Applications”, available online at: https://chatbotsmagazine.com/reinforcement-learning-and-its-practical-applications-8499e60cf751.

What is Natural Language Processing (NLP)

Natural language processing (NLP) is the relationship between computers and human language. More specifically, natural language processing is the computer understanding, analysis, manipulation, and/or generation of natural language. Will a computer program ever be able to convert a piece of English text into a programmer-friendly data structure that describes the meaning of the natural language text? Unfortunately, no consensus has emerged about the form or the existence of such a data structure. Until such fundamental Artificial Intelligence problems are resolved, computer scientists must settle for the reduced objective of extracting simpler representations that describe limited aspects of the textual information.


Natural language processing (NLP) can be defined as the automatic (or semi-automatic) processing of human language. The term ‘NLP’ is sometimes used rather more narrowly than that, often excluding information retrieval and sometimes even excluding machine translation. NLP is sometimes contrasted with ‘computational linguistics’, with NLP being thought of as more applied. Nowadays, alternative terms are often preferred, like ‘Language Technology’ or ‘Language Engineering’. Language is often used in contrast with speech (e.g., Speech and Language Technology). But I’m going to simply refer to NLP and use the term broadly. NLP is essentially multidisciplinary: it is closely related to linguistics (although the extent to which NLP overtly draws on linguistic theory varies considerably).

What is it?

NLP is a way for computers to analyze, understand, and derive meaning from human language in a smart and useful way. By utilizing NLP, developers can organize and structure knowledge to perform tasks such as automatic summarization, translation, named entity recognition, relationship extraction, sentiment analysis, speech recognition, and topic segmentation. NLP is used to analyze text, allowing machines to understand how humans speak. This human-computer interaction enables real-world applications like automatic text summarization, sentiment analysis, topic extraction, named entity recognition, parts-of-speech tagging, relationship extraction, stemming, and more. NLP is commonly used for text mining, machine translation, and automated question-answering.
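As a minimal illustration of the analysis tasks listed above, the sketch below runs a toy pipeline of tokenization, stop-word removal, and frequency counting. The regular expression, stop-word list, and sample text are illustrative assumptions, not a production tokenizer:

```python
import re
from collections import Counter

# A small, illustrative stop-word set (real NLP toolkits ship much larger lists).
STOPWORDS = {"the", "a", "an", "is", "of", "to", "and", "in"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def preprocess(text):
    """Tokenize, then drop stop words."""
    return [t for t in tokenize(text) if t not in STOPWORDS]

text = ("Natural language processing is the analysis of natural language. "
        "Language is processed to derive the meaning of the text.")
tokens = preprocess(text)
print(Counter(tokens).most_common(3))
```

Steps like these (tokenize, filter, count) are the front end of most of the applications listed above, from topic extraction to sentiment analysis.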

Figure 1: NLP Techniques

Importance of NLP

Earlier approaches to NLP involved a more rules-based approach, where simpler machine learning algorithms were told what words and phrases to look for in text and given specific responses when those phrases appeared. But deep learning is a more flexible, intuitive approach in which algorithms learn to identify speakers’ intent from many examples, almost like how a child would learn human language.

The advantage of natural language processing can be seen when considering the following two statements: “Cloud computing insurance should be part of every service level agreement” and “A good SLA ensures an easier night’s sleep — even in the cloud.” If you use natural language processing for search, the program will recognize that cloud computing is an entity, that cloud is an abbreviated form of cloud computing, and that SLA is an industry acronym for service level agreement.

Some Linguistic Terminology

The subareas loosely correspond to some of the standard subdivisions of linguistics:

  • Morphology: the structure of words. For instance, unusually can be thought of as composed of a prefix un-, a stem usual, and an affix -ly. Composed is compose plus the inflectional affix -ed: a spelling rule means we end up with composed rather than composeed.
  • Syntax: the way words are used to form phrases. For example, it is part of English syntax that a determiner such as the will come before a noun, and also that determiners are obligatory with certain singular nouns.
  • Semantics: compositional semantics is the construction of meaning (generally expressed as logic) based on syntax. This is contrasted with lexical semantics, i.e., the meaning of individual words.


Here are a few common ways NLP is being used today:

  • Spell check functionality in Microsoft Word is the most basic and well-known application.
  • Text analysis, also known as sentiment analytics, is a key use of NLP. Businesses can use it to learn how their customers feel emotionally and use that data to improve their service.
  • By using email filters to analyze the emails that flow through their servers, email providers can use Naive Bayes spam filtering to calculate the likelihood that an email is spam based on its content.
  • Call center representatives often hear the same, specific complaints, questions, and problems from customers. Mining this data for sentiment can produce incredibly actionable intelligence that can be applied to product placement, messaging, design, or a range of other uses.
  • Google, Bing, and other search systems use NLP to extract terms from text to populate their indexes and to parse search queries.
  • Google Translate applies machine translation technologies in not only translating words, but also in understanding the meaning of sentences to improve translations.
  • Financial markets use NLP by taking plain-text announcements and extracting the relevant info in a format that can be factored into making algorithmic trading decisions. For example, news of a merger between companies can have a big impact on trading decisions, and the speed at which the particulars of the merger (e.g., players, prices, who acquires who) can be incorporated into a trading algorithm can have profit implications in the millions of dollars.
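The Naive Bayes spam filtering mentioned above can be sketched as follows. The training sentences and the assumption of equal class priors are illustrative, not drawn from any real email corpus:

```python
import math
from collections import Counter

# Tiny, made-up training sets of spam and non-spam ("ham") messages.
spam = ["win money now", "win a free prize now", "claim free money"]
ham = ["meeting schedule for tomorrow", "project deadline tomorrow", "lunch meeting today"]

def word_counts(docs):
    return Counter(w for d in docs for w in d.split())

spam_counts, ham_counts = word_counts(spam), word_counts(ham)
vocab = set(spam_counts) | set(ham_counts)

def log_likelihood(words, counts):
    total = sum(counts.values())
    # Laplace smoothing: add 1 to every count so unseen words get nonzero probability.
    return sum(math.log((counts[w] + 1) / (total + len(vocab))) for w in words)

def classify(text):
    words = text.split()
    # Equal class priors (3 spam and 3 ham training documents), so they cancel out.
    return "spam" if log_likelihood(words, spam_counts) > log_likelihood(words, ham_counts) else "ham"

print(classify("free money now"))    # classified as spam
print(classify("meeting tomorrow"))  # classified as ham
```

Real filters train on millions of messages and model priors explicitly, but the per-word likelihood comparison is the same.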

A Few NLP Examples

  • Use Summarizer to automatically summarize a block of text, extracting topic sentences and ignoring the rest.
  • Generate keyword topic tags from a document using LDA (Latent Dirichlet Allocation), which determines the most relevant words in a document. This algorithm is at the heart of the Auto-Tag and Auto-Tag URL micro-services.
  • Sentiment Analysis, based on Stanford NLP, can be used to identify the feeling, opinion, or belief of a statement, from very negative, to neutral, to very positive.


[1] Ann Copestake, “Natural Language Processing”, 2004, 8 Lectures, available online at: https://www.cl.cam.ac.uk/teaching/2002/NatLangProc/revised.pdf

[2] Ronan Collobert and Jason Weston, “Natural Language Processing (Almost) from Scratch”, Journal of Machine Learning Research 12 (2011) pp. 2493-2537

[3] “Top 5 Semantic Technology Trends to look for in 2017”, available online at: https://ontotext.com/top-5-semantic-technology-trends-2017/

What is Data Preprocessing

Data analysis is now integral to our working lives. It is the basis for investigations in many fields of knowledge, from science to engineering and management to process control. Data on a particular topic are acquired in the form of symbolic and numeric attributes. Analysis of these data gives a better understanding of the phenomenon of interest. When the development of a knowledge-based system is planned, the data analysis involves the discovery and generation of new knowledge for building a reliable and comprehensive knowledge base. Exploratory data analysis and predictive analytics can be used to extract hidden patterns from data and are becoming increasingly important tools to transform data into information. Real-world data is generally incomplete and noisy and is likely to contain irrelevant and redundant information or errors. Data preprocessing is an essential step in the data mining process, helping transform raw data into an understandable format.

Data pre-processing is an essential step in the data mining process. It describes any type of processing performed on raw data to prepare it for another processing procedure. Data preprocessing transforms the data into a format that will be more efficiently and effectively processed for the user. Data pre-processing is a step of the Knowledge Discovery in Databases (KDD) process that reduces the complexity of the data and offers better conditions for subsequent analysis. Through it, the nature of the data is better understood, and data analysis can be performed more accurately and efficiently. Data preprocessing is used in database-driven applications such as customer relationship management, and in model-based applications such as neural networks.

Importance of Data Pre-processing

Data have quality if they satisfy the requirements of the intended use. There are many factors comprising data quality, including accuracy, completeness, consistency, timeliness, believability, and interpretability. Real-world data is usually incomplete (it may contain missing values), noisy (it may contain transmission errors or dirty data), and inconsistent (it may contain duplicate or unexpected values, which lead to inconsistency). Data preprocessing is a proven method of solving such problems.

No quality data, no quality mining results! In other words, if the analysis is performed on low-quality data, then the results obtained will also be of low quality, which is not desirable in the decision-making process. For a quality result, this dirty data must first be cleaned. Converting dirty data into quality data calls for data pre-processing techniques.

Techniques of Data Pre-processing

We look at the major steps involved in data preprocessing, namely, data cleaning, data integration, data reduction, and data transformation.

Figure 1: Techniques of Data Pre-processing

Data Cleaning

Data cleaning routines work to “clean” the data by filling in missing values, smoothing noisy data, identifying or removing outliers, and resolving inconsistencies. If users believe the data are dirty, they are unlikely to trust the results of any data mining that has been applied. Furthermore, dirty data can cause confusion in the mining procedure, resulting in unreliable output. Although most mining routines have some procedures for dealing with incomplete or noisy data, they are not always robust. Instead, they may concentrate on avoiding over-fitting the data to the function being modeled. Therefore, a useful preprocessing step is to run your data through some data-cleaning routines.

Data cleaning or data cleansing techniques attempt to fill in missing values, smooth out noise while identifying outliers, and correct inconsistencies in the data. Data cleansing or data cleaning is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database.

Tasks in Data Cleaning:

  • Fill in missing values
  • Identify outliers and smooth noisy data
  • Correct inconsistent data

Fill in Missing Values:

  • Ignore the tuple
  • Fill in the missing values manually
  • Use a global constant to fill in the missing value.
  • Use the most probable value
  • Use the attribute mean or median for all the samples belonging to the same class as the given tuple
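The last strategy above, filling missing values with the attribute mean over samples of the same class, can be sketched as follows; the records and field names are illustrative:

```python
from statistics import mean

# Class-wise mean imputation: fill each missing age with the mean age of the
# observed records in the same class (data values are made up for illustration).
records = [
    {"cls": "A", "age": 20}, {"cls": "A", "age": None}, {"cls": "A", "age": 30},
    {"cls": "B", "age": 40}, {"cls": "B", "age": None},
]

# Compute the mean of the observed (non-missing) ages per class.
class_means = {}
for cls in {r["cls"] for r in records}:
    observed = [r["age"] for r in records if r["cls"] == cls and r["age"] is not None]
    class_means[cls] = mean(observed)

# Replace each missing age with its class mean.
for r in records:
    if r["age"] is None:
        r["age"] = class_means[r["cls"]]

print(records)
```

Using the class-conditional mean rather than the global mean keeps the imputed value closer to similar samples.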

Identify outliers and Smooth Noisy Data:

  • Binning
  • Regression
  • Outlier analysis.
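Binning can be illustrated with a classic worked example: sorted prices partitioned into equal-depth bins, with each value smoothed to its bin mean. The data values are illustrative:

```python
# Smoothing noisy data by bin means: sort the values, partition them into
# equal-frequency (equal-depth) bins, then replace every value by its bin mean.
prices = sorted([4, 8, 15, 21, 21, 24, 25, 28, 34])
depth = 3  # three values per bin

smoothed = []
for i in range(0, len(prices), depth):
    bin_values = prices[i:i + depth]
    bin_mean = sum(bin_values) / len(bin_values)
    # every value in the bin is replaced by the bin mean
    smoothed.extend([bin_mean] * len(bin_values))

print(smoothed)
```

Smoothing by bin boundaries (replacing each value with the nearest bin endpoint) follows the same structure with a different replacement rule.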

Data Integration

Data mining often requires data integration—the merging of data from multiple data stores. Careful integration can help reduce and avoid redundancies and inconsistencies in the resulting data set. This can help improve the accuracy and speed of the subsequent data mining process. Data integration is a data preprocessing technique that merges the data from multiple heterogeneous data sources into a coherent data store. Data integration may involve inconsistent data and therefore needs data cleaning. Data integration is the process of integrating data from multiple sources to provide a single view over all these sources, and it can be physical or virtual.

Tasks in Data Integration:

  • Data integration: combine data from multiple sources into a single data store.
  • Schema integration: integrate metadata from different sources.
  • Entity identification problem: identify real-world entities across multiple data sources.
  • Detecting and resolving data value conflicts: for the same real-world entity, attribute values from different sources may differ.
  • Handling redundancy in data integration.

Data Transformation

Data transformation is the process of converting data from one format or structure into another format or structure. In this preprocessing step, the data are transformed or consolidated so that the resulting mining process may be more efficient, and the patterns found may be easier to understand.

In data transformation, the data are transformed or consolidated into forms appropriate for mining. Strategies for data transformation include the following:

  • Smoothing, which works to remove noise from the data. Techniques include binning, regression, and clustering.
  • Attribute construction (or feature construction), where new attributes are constructed and added from the given set of attributes to help the mining process.
  • Aggregation, where summary or aggregation operations are applied to the data. For example, the daily sales data may be aggregated so as to compute monthly and annual total amounts. This step is typically used in constructing a data cube for data analysis at multiple abstraction levels.
  • Normalization, where the attribute data are scaled so as to fall within a smaller range, such as −1.0 to 1.0, or 0.0 to 1.0.
  • Discretization, where the raw values of a numeric attribute (e.g., age) are replaced by interval labels (e.g., 0–10, 11–20, etc.) or conceptual labels (e.g., youth, adult, senior). The labels, in turn, can be recursively organized into higher-level concepts, resulting in a concept hierarchy for the numeric attribute.
  • Concept hierarchy generation for nominal data, where attributes such as street can be generalized to higher-level concepts, like city or country. Many hierarchies for nominal attributes are implicit within the database schema and can be automatically defined at the schema definition level.
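Normalization, for instance, can be sketched with min-max scaling into the range 0.0 to 1.0; the attribute values below are illustrative:

```python
# Min-max normalization rescales each attribute value v into [0, 1]:
#   v' = (v - min) / (max - min)
values = [200, 300, 400, 600, 1000]
lo, hi = min(values), max(values)
normalized = [(v - lo) / (hi - lo) for v in values]
print(normalized)  # [0.0, 0.125, 0.25, 0.5, 1.0]
```

To target a range like −1.0 to 1.0 instead, multiply the result by the new range width and add the new minimum.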

Data Reduction

A database or data warehouse may store terabytes of data, and performing complex analysis on such voluminous data may take a very long time on the complete data set. Therefore, data reduction is used to obtain a reduced representation of the data set that is much smaller in volume yet produces the same (or almost the same) analytical results. Data reduction is the transformation of numerical or alphabetical digital information, derived empirically or experimentally, into a corrected, ordered, and simplified form.

Data reduction Strategies:

  • Data Compression
  • Dimensionality reduction
  • Discretization and concept hierarchy generation
  • Numerosity reduction
  • Data cube aggregation.


[1] Tomar, Divya, and Sonali Agarwal, “A survey on pre-processing and post-processing techniques in data mining”, International Journal of Database Theory and Application 7, no. 4 (2014): pp. 99-128.

[2] Bilquees Bhagat, “Data pre-processing techniques in data mining”, September 2, 2017, available online at: https://cloudera2017.wordpress.com/2017/09/02/1182/

[3] “Data Preprocessing”, available online at: http://www.comp.dit.ie/btierney/BSI/Han%20Book%20Ch3%20DataExploration.pdf

Ant Colony Optimization Algorithm in Machine Learning

There are ever-increasing efforts to search for and develop algorithms that can find solutions to combinatorial optimization problems. In this vein, the Ant Colony Optimization metaheuristic takes inspiration from biology and proposes successively more efficient versions of its algorithms.


Ant Colony Optimization (ACO) is a paradigm for designing metaheuristic algorithms for combinatorial optimization problems. The essential trait of ACO algorithms is the combination of a priori information about the structure of a promising solution with a posteriori information about the structure of previously obtained good solutions.

ACO is a class of algorithms whose first member, called the Ant System, was initially proposed by Colorni, Dorigo, and Maniezzo. The main underlying idea, loosely inspired by the behavior of real ants, is that of a parallel search over several constructive computational threads based on local problem data and on a dynamic memory structure containing information on the quality of previously obtained results. The collective behavior emerging from the interaction of the different search threads has proved effective in solving combinatorial optimization (CO) problems.

More specifically, we can say that “Ant Colony Optimization (ACO) is a population-based, general search technique for the solution of difficult combinatorial problems which is inspired by the pheromone trail laying behavior of real ant colonies.”

ACO Principle

Ant Colony Optimization principles are based on the natural behavior of ants. In their daily life, one of the tasks ants have to perform is to search for food in the vicinity of their nest. While walking in such a quest, the ants deposit a chemical substance called pheromone on the ground.

At first, the ants wander randomly. When an ant finds a source of food, it walks back to the colony leaving “markers” (pheromones) that show the path has food. When other ants come across the markers, they are likely to follow the path with a certain probability. If they do, they then populate the path with their own markers as they bring the food back. As more ants find the path, it gets stronger until there are a couple of streams of ants traveling to various food sources near the colony. Because the ants drop pheromones every time they bring food, shorter paths are more likely to be stronger, hence optimizing the “solution.”

Ant Colony Optimization (ACO) is an example of how inspiration can be drawn from seemingly random, low-level behavior to counter problems of great complexity. A specific focus lies on the collective behavior of ants after being confronted with a choice of path when searching for a food source (see Figure 1). Ants deposit pheromones on the ground having selected a path, with the result that fellow ants tend to follow the path with the higher pheromone concentration. This form of communication allows ants to transport food back to their nest in a highly effective manner. Following random fluctuations, one of the paths acquires a higher pheromone concentration, and eventually the entire colony converges toward the use of that path.

Figure 1: This shows the potential paths a colony can take from nest (N) to food (F), with the route eventually converging as a result of random fluctuations in pheromone deposits.

ACO Algorithm

The ant colony algorithm is an algorithm for finding optimal paths, based on the behavior of ants searching for food. Here we present an ACO algorithm.

Table 1: ACO Algorithm
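As a concrete sketch of the algorithm, the following minimal Ant System variant is applied to a toy four-city travelling salesman instance. The city coordinates and all parameter values (alpha, beta, the evaporation rate rho, and the numbers of ants and iterations) are illustrative assumptions, not recommended settings:

```python
import math
import random

random.seed(42)

# Toy TSP instance: four cities (coordinates are made up for illustration).
cities = [(0.0, 0.0), (1.0, 5.0), (5.0, 2.0), (6.0, 6.0)]
n = len(cities)
dist = [[math.dist(cities[i], cities[j]) or 1e-9 for j in range(n)] for i in range(n)]

alpha, beta = 1.0, 2.0      # influence of pheromone vs. heuristic (1/distance)
rho, Q = 0.5, 1.0           # evaporation rate and pheromone deposit constant
pheromone = [[1.0] * n for _ in range(n)]

def build_tour():
    """One ant constructs a tour, choosing each next city with probability
    proportional to pheromone^alpha * (1/distance)^beta."""
    tour = [0]
    unvisited = set(range(1, n))
    while unvisited:
        cur = tour[-1]
        cand = sorted(unvisited)
        weights = [pheromone[cur][j] ** alpha * (1.0 / dist[cur][j]) ** beta
                   for j in cand]
        nxt = random.choices(cand, weights=weights)[0]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_length(tour):
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

best_tour, best_len = None, float("inf")
for _ in range(50):                              # iterations
    tours = [build_tour() for _ in range(10)]    # 10 ants per iteration
    # Evaporate, then let each ant deposit pheromone inversely proportional
    # to its tour length (shorter tours reinforce their edges more strongly).
    for i in range(n):
        for j in range(n):
            pheromone[i][j] *= (1.0 - rho)
    for tour in tours:
        L = tour_length(tour)
        if L < best_len:
            best_tour, best_len = tour, L
        for i in range(n):
            a, b = tour[i], tour[(i + 1) % n]
            pheromone[a][b] += Q / L
            pheromone[b][a] += Q / L

print(best_tour, round(best_len, 3))
```

The evaporation step is what lets the colony forget poor early choices, mirroring the decay of real pheromone trails.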


[1] Paul Sharkey, “Ant Colony Optimization: Algorithms and Applications”, March 6, 2014, available online at: http://www.lancaster.ac.uk/pg/sharkeyp/Topic1.pdf

[2] Maniezzo, Vittorio, and Antonella Carbonaro, “Ant colony optimization: an overview”, In Essays and surveys in metaheuristics, pp. 469-492, Springer, Boston, MA, 2002.

[3] Parsons, Simon, “Ant Colony Optimization by Marco Dorigo and Thomas Stützle, MIT Press, 305 pp.”, The Knowledge Engineering Review 20, no. 1 (2005): 92.

What is Business Intelligence

Every business is dynamic in nature and is affected by various external and internal factors. These factors include external market conditions, competitors, internal restructuring and re-alignment, operational optimization, and paradigm shifts in the business itself. New regulations and restrictions, in combination with the above factors, contribute to the constant evolutionary nature of compelling, business-critical information; the kind of information that an organization needs to sustain and thrive. Business Intelligence (“BI”) is a broad term that encapsulates the process of gathering information pertaining to a business and the market it functions in.

What is it?

Business intelligence (BI) has two basic meanings related to the use of the term intelligence. The first, less frequent, is the human intelligence capacity applied in business affairs and activities; intelligence of business, in this sense, is a field investigating the application of human cognitive faculties and artificial intelligence technologies to management and decision support for different business problems. The second relates to intelligence as information valued for its currency and relevance: expert information, knowledge, and technologies efficient in the management of organizational and individual business. In this second sense, business intelligence is a broad category of applications and technologies for gathering, providing access to, and analyzing data for the purpose of helping enterprise users make better business decisions.

Business Intelligence (BI) is a vital subject that covers a vast area of interest for today’s businessmen. BI consists of both internal and external categories that deal with the ability of a company to determine what its competitors are doing as well as understand what forces may be at work against them. Finally, how does your business incorporate the data that it collects into useful information yielding a competitive advantage? The field of BI is frequently murky and can easily cross the confused boundaries of business ethics as well as federal law. Using current academic literature, case studies, and an interview with a BI provider, we have outlined the key aspects of BI that your business needs to understand in today’s competitive environment.

Figure 1: Business Intelligence Cycle

Defining BI

The term Business Intelligence (BI) refers to technologies, applications, and practices for the collection, integration, analysis, and presentation of business information. The purpose of Business Intelligence is to support better business decision-making. Essentially, Business Intelligence systems are data-driven Decision Support Systems (DSS). Business Intelligence is sometimes used interchangeably with briefing books, report and query tools, and executive information systems.

“Business Intelligence is the art of gaining a business advantage from data by answering fundamental questions, such as how various customers rank, how the business is doing now and where it will be if it continues on the current path, and which clinical trials should be continued and which should stop having money dumped into them!”

With a strong BI, companies can support decisions with more than just a gut feeling. Creating a fact-based “decisioning” framework via a strong computer system provides confidence in any decisions made.

Business Intelligence (BI) is a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information which can be used to enable more effective strategic, tactical, and operational insights and decision-making. Within this are included a variety of technologies, including data quality and master data management.

BI Examples

Business intelligence (BI) is the use of data analysis in taking strategic decisions in the business environment. This definition might seem somewhat abstract if it is not illustrated with some concrete examples.

Stock Optimization

Sectors with a pronounced seasonal business cycle often find it very difficult to optimize their stock. For example, if sales of a particular product shoot up in summer or at Christmas, it is a big challenge to store the right amount of stock in order to maximize profit.

To address this issue, some companies in the canning, preserving and general food sector have been able to increase profitability by nearly 10% using BI techniques based on:

  • The adoption of a decision support system (DSS).
  • The exhaustive analysis of historical sales and stocktaking data for warehouse products.

In many cases, the results obtained have made possible a much more efficient and profitable redesign of the entire logistical and productive warehousing process.

Increasing Customer Loyalty

Business intelligence processes are also very useful for identifying the most profitable customers of, for example, a supermarket or clothing chain, who can subsequently be brought into loyalty schemes.

To do this, a great deal of data must be correctly analysed in order to find the ideal profile: age, sex, geographical location, marital status, number of children, etc. A good way of obtaining this information might be the creation of “discount cards”, where, in exchange for a card, the client has to provide a range of personal details.

Detecting and Correcting Budget Deviations

There are plenty of companies, especially large ones, that are affected by significant budget deviations: discrepancies between the estimated operational parameters and targets set at the beginning of the year and the actual results produced twelve months later.

An analysis of the strategic objectives of the company by means of a Balanced Scorecard can quickly detect the reason for these deviations and enable their rapid correction. Sometimes the problem might be a mismatch between a company’s advertising and marketing operations and its real needs.

Problems for Small Businesses

The view that BI is only of any use to large companies is as widely held as it is wrong. Simple business intelligence systems can be of great help to small businesses in deciding, for example, what the best opening hours are, or what day of the week is best to take off.

BI Vendors

The BI market is in constant flux. New vendors frequently appear, and just as frequently disappear or are acquired by larger companies. Some of the notable BI vendors are listed here:

Cloud-based 1010data provides big data discovery options within the same location where the data is stored, speeding up important business decisions by giving all users easier, quicker access with fewer clicks.

Actuate’s BIRT business intelligence software, known for its focus on open source, utilizes an Eclipse platform to streamline reports and help generate useful insights with three unique types of reporting tools.

Alteryx’s BI platform is powered by a unique data blending capability, which seamlessly unites cloud data, third-party data, and internal company information, creating a smoother, more efficient workflow.

Arcplan offers its customers two platforms for delivering Business Intelligence functionality, the Enterprise platform and the Engage platform, and can integrate with other BI tools.

Birst’s BI solutions include a wide variety of features, such as big data, data warehouse automation, and data mashups. Users can choose between two platforms based on their needs.


[1] Dejan Zdraveski and Igor Zdravkoski, “Business Intelligence Tools for Data Analysis and Decision Making”.

[2] Jayanthi Ranjan, “Business Intelligence: Concepts, Components, Techniques and Benefits”, Journal of Theoretical and Applied Information Technology, Volume 9, Number 1, pp 060 – 070, 2005-2009

[3] Greg Nelson, “Introduction to the SAS® 9 Business Intelligence Platform: A Tutorial”, In SAS Global Forum. 2007.

[4] Captio, “Some practical examples of the use of business intelligence”, available online at: https://www.captio.com/blog/some-practical-examples-of-the-use-of-business-intelligence

[5] Justin Heinze, “The Ultimate List of Business Intelligence Vendors”, available online at: https://www.betterbuys.com/bi/business-intelligence-vendors/

What is Community Detection

Community detection is one of the topics most relevant to the machine-learning technique of clustering. The term community is used to indicate a group of similar objects, based on their differential behaviors. Meanwhile, advances in technology and computation have made it possible to collect and mine a massive amount of real-world data. Mining such “big data” allows us to understand the structure and function of real systems and to find unknown and interesting patterns. This section provides a brief overview of community structure.


In today’s interconnected world, with the rise of online social networks, graph mining and community detection have become highly relevant. Understanding the formation and evolution of communities is a long-standing research topic in sociology, in part because of its fundamental connections with studies of urban development, criminology, social marketing, and several other areas. With the increasing popularity of online social network services like Facebook, the study of community structures assumes even more significance. Identifying and detecting communities is not only of particular importance but has immediate applications. For instance, for effective online marketing, such as placing online ads or deploying viral marketing strategies [1], identifying communities in social networks can often lead to more accurate targeting and better marketing results. Although online user profiles and other semantic information are helpful for discovering user segments, this kind of information is often at a coarse-grained level and overlooks the community structure that reveals rich information at a fine-grained level.


Many real-world complex systems, such as social or computer networks, can be modeled as large graphs, called complex networks. Because of the increasing volume of data and the need to understand such huge systems, complex networks have been extensively studied over the last ten years. Communities clearly overlap in real-world systems, especially in social networks, where every individual belongs to several communities: family, colleagues, groups of friends, etc. Finding all these overlapping communities in a huge graph is very complex: in a graph of n nodes, there are 2^n possible communities, and the number of possible community structures is larger still. Even if these communities could be efficiently computed, the results might be uninterpretable. Because of the complexity of overlapping community detection, most studies have restricted the community structure to a partition, where each node belongs to one and only one community [14].

Identifying network communities can be viewed as a problem of clustering a set of nodes into communities, where a node can belong to multiple communities at once. Because nodes in communities share common properties or attributes, and because they have many relationships among themselves, there are two sources of data that can be used to perform the clustering task. The first is the data about the objects (i.e., nodes) and their attributes. Known properties of proteins, users’ social network profiles, or authors’ publication histories may tell us which objects are similar, and to which communities or modules they may belong. The second source of data comes from the network and the set of connections between the objects. Users form friendships, proteins interact, and authors collaborate [14].


Community detection is a key to understanding the structure of complex networks, and ultimately to extracting useful information from them. An extensively studied structural property of real-world networks is their community structure. The community structure captures the tendency of nodes in the network to group together with other similar nodes into communities, a property observed in many real-world networks. Despite extensive study of the community structure of networks, there is no consensus on a single quantitative definition of a community, and different studies have used different definitions. A community, also known as a cluster, is usually thought of as a group of nodes that have many connections to each other and few connections to the rest of the network. Identifying communities in a network can provide valuable information about the structural properties of the network, the interactions among nodes in the communities, and the role of the nodes in each community [15].

In the clustering framework, a community is a cluster of nodes in a graph, but a very important question is what a cluster is. Most of the time, the objects in a cluster must be more similar to each other than to objects outside the cluster: objects are clustered or grouped on the principle of maximizing intra-class similarity and minimizing inter-class similarity. Note that this definition implies the need to define a similarity measure and/or a cluster fitness measure.

Figure 1: Social Network Community Structure

Community detection has been widely used in social network analysis to study the behavior and interaction patterns of people in social networks. Community detection is also important to identify powerful nodes in the network, based on their structural position to initiate influential campaigns.
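As a small, self-contained sketch of community detection, the label propagation below runs on a toy graph of two four-node cliques joined by a single bridge edge. The graph, the synchronous update scheme, and the smallest-label tie-break are illustrative assumptions; practical implementations usually break ties randomly and update asynchronously:

```python
from collections import Counter

# Toy graph: two 4-cliques (dense communities) joined by one bridge edge (3-4).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),   # clique A
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),   # clique B
         (3, 4)]                                           # bridge
nodes = range(8)
adj = {u: set() for u in nodes}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

# Every node starts with its own unique label, then repeatedly adopts the
# most frequent label among its neighbors (smallest label on ties).
labels = {u: u for u in nodes}
for _ in range(20):  # capped: synchronous updates can oscillate in general
    counts = {u: Counter(labels[v] for v in adj[u]) for u in nodes}
    new = {}
    for u in nodes:
        top = max(counts[u].values())
        new[u] = min(l for l, c in counts[u].items() if c == top)
    if new == labels:
        break
    labels = new

print(labels)
```

Because the within-clique connections outnumber the single bridge, the labels settle into one value per clique, recovering the two communities.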


[1] Chayant Tantipathananandh, “Detecting and Tracking Communities in Social Networks”, Dissertation, Northwestern University, 2013.

[2] J. Chang and D. M. Blei, Relational topic models for document networks. In AISTATS ’09, 2009

[3] Clara Granell, Sergio G´omez and Alex Arenas, “Data clustering using community detection algorithms”, Int. J. Complex Systems in Science volume 1 (2011), pp. 21–24

[4] S. Fortunato, “Community detection in graphs”, Physics Reports, vol. 486, no. 3-5, pp. 75 – 174, 2010, online available at: http://www.sciencedirect.com/science/article/B6TVP-4XPYXF1- 1/2/99061fac6435db4343b2374d26e64ac1
