15 November 2019
NLP – Processing of natural language in a way that is easy to understand by machines
Stopping for a moment in the hustle and bustle of our age, it’s very easy to see how powerful the role of artificial intelligence is and how it changes the way we look at the world today. One result of this rapid development is the creation of machines and computer programs that try to reproduce the human way of thinking.
One of the artificial intelligence technologies that brings us closer to this goal is Natural Language Processing (NLP).
What is NLP and how does it work?
NLP is an area of artificial intelligence research, and one of its tasks is to enable natural interaction between computers and human beings. In essence, this technology helps machines to understand and communicate in human language by means of tools such as machine learning algorithms. It relies on a linguistic module, based on the analysis of natural language, in which we distinguish seven levels: phonology, lexis, morphology, syntax, semantics, pragmatics and discourse.
The first job of NLP is to separate the text into sentences. The second stage is to divide the sentences into tokens, which break the text into simple pieces, i.e. numbers, words and punctuation marks. How this is done depends on the language in which the content was written and on what it is about. Carrying out this process for English or German is fairly simple, but the situation gets more complicated with languages like Polish, where the text requires a more complex analysis due to complex grammar. An example of tokenization is splitting the word “dark-red” into two tokens, “dark” and “red”.
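The two stages described above, sentence separation and tokenization, can be sketched in a few lines of Python. This is only a minimal illustration using regular expressions. The function names `split_sentences` and `tokenize` are hypothetical, and the rules are deliberately naive (for instance, abbreviations such as “e.g.” would be split incorrectly).

```python
import re


def split_sentences(text):
    # Naive rule: a sentence ends at ".", "!" or "?" followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [part.strip() for part in parts if part.strip()]


def tokenize(sentence):
    # Numbers (optionally with a decimal part) and words become tokens,
    # and every punctuation mark except the hyphen is a token of its own.
    # Hyphens are dropped, so "dark-red" yields "dark" and "red".
    return re.findall(r"\d+(?:\.\d+)?|\w+|[^\w\s-]", sentence)


if __name__ == "__main__":
    text = "The wall is dark-red. It cost 25 dollars!"
    for sentence in split_sentences(text):
        print(tokenize(sentence))
```

A production tokenizer for a language such as Polish would need far richer rules or a trained model. The sketch only shows the shape of the task.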
The lack of equivalence between texts is also an important consideration. A literary text is different from a text in a newspaper or from a scientific paper containing additional elements such as graphs, diagrams and chemical, physical or mathematical formulas. The next step is to normalize the text, e.g. to replace the words “dwa” and “dwie” (different forms of the word “two” in Polish) with “2”, and to analyze any named entities. Going further, there is the so-called unification of word forms, i.e. stemming and lemmatization, in which the various forms of the analyzed word are brought together. Stemming strips affixes (in practice mostly suffixes) to leave a stem, which need not be a real word. Lemmatization is a more difficult process, in which the word is reduced to its dictionary form (lemma). Finally, there is part-of-speech tagging, which identifies the part of speech of each word, and parsing, which analyzes the syntactic structure of the sentence and may draw on semantic information.
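The difference between normalization, stemming and lemmatization may be easier to see in a toy example. The sketch below is an assumption-laden illustration, not a real linguistic tool: the tables `NUMBER_WORDS`, `SUFFIXES` and `LEMMAS` are tiny, hand-made and hypothetical, whereas real systems rely on large dictionaries or trained models.

```python
# Hypothetical lookup tables, kept tiny for illustration only.
NUMBER_WORDS = {"dwa": "2", "dwie": "2", "two": "2"}
SUFFIXES = ("ing", "ed", "s")
LEMMAS = {"went": "go", "better": "good", "mice": "mouse"}


def normalize(token):
    # Lower-case the token and map known number words to digits.
    lowered = token.lower()
    return NUMBER_WORDS.get(lowered, lowered)


def stem(token):
    # Crude stemming: cut the first matching suffix, as long as
    # at least three characters of the word remain.
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def lemmatize(token):
    # Dictionary lookup for irregular forms, falling back to the stemmer.
    return LEMMAS.get(token, stem(token))


if __name__ == "__main__":
    print(normalize("Dwie"))                 # 2
    print(stem("walking"), stem("went"))     # walk went
    print(lemmatize("went"))                 # go
```

The stemmer cannot relate “went” to “go” because no suffix rule connects them. That gap is why lemmatization needs dictionary knowledge and is the harder of the two processes.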
Analysis of text documents
A system that analyses documents receives the collected data, usually in text form, e.g. as .txt or .html files. We divide IT tools for text analysis into:
- Simple tools for obtaining basic statistics on documents, e.g. the frequency of word occurrences (e.g. TextSTAT, AntConc)
- Indexing and search engines (e.g. Google, Yahoo, Windows Desktop Search)
- Advanced tools, i.e. those that allow for multifaceted analysis of text, using techniques such as visualisation of results (e.g. SAS Text, Text Garden)
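The simplest category in the list above, word frequency statistics, can be approximated with the Python standard library. This is a rough sketch of the idea rather than a description of how TextSTAT or AntConc work internally, and the function name `word_frequencies` is hypothetical.

```python
import re
from collections import Counter


def word_frequencies(text):
    # Lower-case the text, extract runs of word characters and count them.
    words = re.findall(r"\w+", text.lower())
    return Counter(words)


if __name__ == "__main__":
    freqs = word_frequencies("The cat saw the other cat.")
    for word, count in freqs.most_common(3):
        print(word, count)
```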
Natural Language Processing uses algorithms designed to identify and extract the rules of natural language so that unstructured data can be converted into a form that computers can process. The computer then uses an algorithm to collect the necessary data and extract meaning from each sentence. The processing of natural language already accompanies us in our everyday lives. Recognizing and processing textual information and generating speech are increasingly used, among other things, to help people learn foreign languages, to generate translations and abstracts automatically, and in applications and robotics. Examples include language translation applications such as Google Translate, chatbots, and personal assistants such as Google Assistant or Amazon’s Alexa.
NLP models and algorithms
The first step in NLP depends on the application of the system. Voice-based systems (e.g. Google Assistant) first translate speech into text. A classic way of doing this is with a Hidden Markov Model (HMM), which takes speech clips approximately 10 to 20 milliseconds long and searches them for phonemes by comparing them with previously recorded speech.
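To give a feel for how an HMM maps short speech clips to phonemes, here is a toy sketch of Viterbi decoding, the standard algorithm for finding the most likely sequence of hidden states in an HMM. Everything specific in it is an assumption made for illustration: the two phoneme states “s” and “iy”, the frame labels “hiss” and “tone” (stand-ins for real acoustic features), and all of the probabilities. It does not describe how any particular voice assistant is built.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    # best[t][s] holds (probability of the best path ending in state s
    # at time t, the previous state on that path).
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][prev][0] * trans_p[prev][s] * emit_p[s][obs], prev)
                for prev in states
            )
        best.append(column)

    # Start from the most probable final state and follow the back-pointers.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Hypothetical two-phoneme model for the word "see".
STATES = ("s", "iy")
START_P = {"s": 0.8, "iy": 0.2}
TRANS_P = {"s": {"s": 0.6, "iy": 0.4}, "iy": {"s": 0.1, "iy": 0.9}}
EMIT_P = {"s": {"hiss": 0.9, "tone": 0.1}, "iy": {"hiss": 0.2, "tone": 0.8}}

if __name__ == "__main__":
    frames = ["hiss", "hiss", "tone", "tone"]  # one label per short clip
    print(viterbi(frames, STATES, START_P, TRANS_P, EMIT_P))
    # ['s', 's', 'iy', 'iy']
```

Each frame is assigned the phoneme that best explains it given its neighbours, which is how a sequence of very short clips turns into a sequence of speech sounds.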
NLP techniques are mainly based on syntactic and semantic analysis. Syntactic analysis assesses whether natural language conforms to grammatical rules. Semantic analysis, on the other hand, helps to create a framework for natural language processing and explains how NLP and artificial intelligence interpret human sentences. To sum up, while the HMM method divides speech into basic structures, semantic analysis helps to add meaning.
How NLP functions in business
Natural Language Processing is becoming more and more popular in business. It is already used on a daily basis and has been for a number of years, e.g. in spelling checkers and online search. An example of NLP that has recently come to the forefront is chatbots, a great solution that allows a company to maintain constant contact with its customers. Many companies also use bots to optimize internal work, e.g. in the HR department, where they automatically respond to questions from employees or job candidates. Such solutions make it possible to categorise queries and ensure a fast response. Another example, well known to many smartphone users, is SwiftKey, a keyboard application that predicts and suggests the most commonly used words while the user is typing. Artificial intelligence, machine learning and natural language processing are the pioneers of innovation in the world of technology. Newer and better technological solutions based on intelligent systems appear in the blink of an eye, and they play a crucial role in business. Tools such as sentiment analysis help companies to find out whether a post is positive or negative and to classify customer problems.
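Deciding whether a post is good or bad can be illustrated with a very small lexicon-based classifier. This is a simplified sketch under obvious assumptions: the word lists `POSITIVE` and `NEGATIVE` are hypothetical and far too short for real use, and commercial tools typically rely on trained models that also handle negation, sarcasm and context.

```python
import re

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"good", "great", "fast", "helpful", "love"}
NEGATIVE = {"bad", "slow", "broken", "hate", "rude"}


def classify_sentiment(post):
    # Count positive words minus negative words in the post.
    words = re.findall(r"[a-z']+", post.lower())
    score = sum((word in POSITIVE) - (word in NEGATIVE) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify_sentiment("Great service and fast delivery!"))       # positive
    print(classify_sentiment("The app is slow and support was rude."))  # negative
```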
What is NLP in practice?
A practical example is a program analyzing the Polish equivalent of the sentence “I have a collection of pens” (in Polish, “pen” and “feather” are the same word, “pióro”). The program needs context in order to understand whether the text refers to bird feathers or writing pens.
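One simple way a program can use context to resolve such ambiguity is to compare the surrounding words with a list of cue words for each sense and pick the sense with the largest overlap. The sketch below assumes English sentences containing the Polish word “pióro”, and the cue lists in `SENSES` are hypothetical. It is a much simplified relative of dictionary-overlap methods, not a complete disambiguation system.

```python
import re

# Hypothetical cue words for the two senses of the Polish word "pióro".
SENSES = {
    "writing pen": {"ink", "write", "writing", "paper", "fountain", "signature"},
    "bird feather": {"bird", "wing", "goose", "peacock", "plumage", "nest"},
}


def disambiguate(context_sentence):
    # Choose the sense whose cue words overlap most with the context.
    words = set(re.findall(r"\w+", context_sentence.lower()))
    overlaps = {sense: len(words & cues) for sense, cues in SENSES.items()}
    best = max(overlaps, key=overlaps.get)
    # With no overlapping cue word there is not enough context to decide.
    return best if overlaps[best] > 0 else None


if __name__ == "__main__":
    print(disambiguate("I fill my pióro with ink before writing"))   # writing pen
    print(disambiguate("The pióro fell from the wing of a goose"))   # bird feather
    print(disambiguate("I have a collection of pióro"))              # None
```

The last call returns `None`, which mirrors the point made above: without further context, the sentence about a collection cannot be resolved either way.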
Processing natural language is a major challenge for scientists. The problem lies in the nature of human language, above all in the ambiguity of its expressions. The rules we use daily are not easy for computers to understand and analyze.
Natural language processing is used in smart homes and cars, as well as in speech synthesizers that help people with disabilities, such as partial sight or partial hearing loss, to communicate with the outside world. NLP is one of the most important technologies for knowledge management, because it makes it possible to process content (e.g. web pages) automatically, to search for elements of knowledge, and to learn about new elements of reality such as statements, relations or concepts.
Artificial intelligence, in all its forms, should be seen as the next step in the evolution of humanity. Immersing ourselves in the field of artificial intelligence, we discover a beauty and fascination that, just two decades ago, existed only in the dreams of many scientists.