Prominent Applications Of Natural Language Processing In Communication
Natural Language Processing makes the semantics of human language comprehensible to systems, devices, and machines. Communication is the key to progress! This phrase has been repeated innumerable times, because whether in an employee-employer relationship or an owner-client relationship, communication is the driving force behind successful decision making. Over the years, communication has been transformed by innovations in technology: from the invention of the telephone to the integration of voice recognition into Amazon's Alexa, technology has revolutionized how humans communicate. Good communication, moreover, rests on understanding the complexities of language. Language is an inherent behavior of living organisms and includes semantics such as words, signs, and images. A human being, on reaching adolescence, becomes well-versed in the different aspects of communication, but modern technology-driven devices require an immense amount of learning and training before they can understand the semantics of a language. Artificial Intelligence technologies and Machine Learning models make this task easier. Specifically, one AI technology, Natural Language Processing (NLP), is integrated into almost every system of communication. In this article, we will look at the prominent applications of NLP in communication.
Email Filters
The amount of mail anyone receives is overwhelming. While some emails are business-related, others are sent purely for promotional purposes. With the help of natural language processing (NLP), these emails can be categorised as primary, social, or promotions. Additionally, spam filters built on the semantics of the language can be integrated into the system. Gmail already does this, which keeps the inbox manageable.
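As a hedged illustration of how such categorisation could be built, here is a minimal sketch using scikit-learn (our choice of library; the article names none), trained on a tiny, purely illustrative labelled set:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny illustrative training set: email text -> inbox category.
emails = [
    "Meeting moved to 3 pm, please confirm",
    "Flash sale! 50% off everything today only",
    "Alice tagged you in a photo",
    "Quarterly report attached for review",
]
labels = ["primary", "promotions", "social", "primary"]

# Bag-of-words features weighted by TF-IDF, fed to a Naive Bayes classifier.
classifier = make_pipeline(TfidfVectorizer(), MultinomialNB())
classifier.fit(emails, labels)

print(classifier.predict(["Big sale today, everything 50% off"]))  # -> ['promotions']

A production spam or category filter is trained on millions of labelled messages, but the pipeline shape, text features into a probabilistic classifier, is the same idea.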
Smart Assistants
"Hey Siri!" This phrase is a trademark of Apple's iPhone, and with voice recognition technology now so pervasive, it is familiar to almost everyone, iPhone user or not. But have we ever paused to ask how Apple's Siri or Amazon's Alexa answers our questions with such precision, or how they comprehend human language and respond so promptly? The answer is the integration of Natural Language Processing into Siri and Alexa. With the help of NLP and the underlying ML models, these assistants pick up contextual clues that let them answer questions promptly and precisely. Tech experts suggest that in the future, humans will live in a world built around NLP.
Search Results
When promoting company profiles, articles, and blogs, or even building a website, the emphasis is on search results, and many job postings make search engine optimization a priority. While this term has created a buzz for increasing customer engagement and is a key to marketing, it is worth knowing the "what" of this application: the integration of NLP into the system, which surfaces relevant results based on the behaviors and semantics of the language it was trained on. An example is Google Search, which understands a user's query from the few words they have typed.
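A minimal sketch of the underlying idea, ranking documents against a query by TF-IDF cosine similarity, is below; scikit-learn and the toy documents are our illustrative assumptions, not how Google Search is actually implemented:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "natural language processing in communication",
    "applications of data analytics in hospitality",
    "language translation with transformer models",
]
query = ["language translation"]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)     # index the "corpus"
query_vector = vectorizer.transform(query)       # vectorize the search query

# Higher cosine similarity = more relevant document.
scores = cosine_similarity(query_vector, doc_vectors).ravel()
for score, doc in sorted(zip(scores, docs), reverse=True):
    print(f"{score:.2f}  {doc}")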
Predictive Text
Autocorrect! We use this simple feature numerous times a day on our devices and take it for granted without ever realizing the science behind it. The predictive capabilities of NLP are the reason our devices are so familiar with autocorrect: they predict based on the semantics they were trained on and will either finish the word or suggest a relevant one. Moreover, they let users customize their language preferences and learn from them.
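The simplest form of this prediction is an n-gram language model: count which word tends to follow which. Here is a minimal bigram sketch (the tiny corpus is illustrative, not a real training set):

from collections import Counter, defaultdict

corpus = "the key to good communication is the key to progress".split()

# Count, for each word, which words follow it and how often.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def suggest(prev_word, k=3):
    """Return up to k most likely next words after prev_word."""
    return [w for w, _ in bigrams[prev_word].most_common(k)]

print(suggest("the"))  # ['key']
print(suggest("to"))   # ['good', 'progress']

Real keyboards use far richer models (and personalize them per user), but the principle of predicting the next token from seen context is the same.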
Language Translation
Businesses no longer need a Spanish thesaurus to translate what a client is saying in Spanish into English. By integrating NLP into devices or applications, online translators translate languages automatically and accurately. An example is Google's mobile keyboard, which makes the translation task easier.
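As a quick illustration (not how Google's keyboard works internally), here is a minimal sketch using the open-source Hugging Face transformers library with the publicly available Helsinki-NLP/opus-mt-es-en Spanish-to-English model; both are our choice of example, not something named by the article:

from transformers import pipeline

# Helsinki-NLP/opus-mt-es-en is one publicly available Spanish -> English model.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-es-en")

result = translator("¿Cómo puedo ayudarle hoy?")
print(result[0]["translation_text"])  # e.g. "How can I help you today?"

A translation model built from scratch with this same Transformer architecture is walked through later in this page.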
Text Mining Vs Natural Language Processing
Difference between Text Mining and Natural Language Processing
Head To Head Comparison Between Text Mining and Natural Language Processing (Infographics)
Below are the top 5 comparisons between Text Mining and Natural Language Processing:
Key Differences between Text Mining and Natural Language Processing
Below are the differences between Text Mining and Natural Language Processing:
Application – Concepts from NLP are used in the following basic systems:
Speech recognition system
Question answering system
Translation from one specific language to another specific language
Text summarization
Sentiment analysis
Template-based chatbots
Text classification
Topic segmentation
Humanoid robots that understand natural language commands and interact with humans in natural language.
Building a universal machine translation system is the long-term goal in the NLP domain.
It generates the logical title for the given document.
Generates meaningful text for specific topics or for an image given.
Advanced chatbots, which generate personalized text for humans and ignore mistakes in human writing
Popular applications of Text Mining:
Contextual Advertising
Content enrichment
Social media data analysis
Spam filtering
Fraud detection through claims investigation
Development life cycle –
The general development process will have the following steps for developing an NLP system.
Understand the problem statement.
Decide what kind of data or corpus you need to solve the problem. Data collection is an essential activity for solving the problem.
Analyze the collected corpus: what is its quality and quantity? Depending on the quality of the data and the problem statement, you need to do preprocessing.
Once done with preprocessing, start with the process of feature engineering. Feature engineering is the most critical aspect of NLP and data science-related applications. Different techniques like parsing and semantic trees are used for this.
Now, depending on which techniques you are going to use, prepare the feature files that you will provide as input to your decision algorithm.
Run the model, test it, and finetune it.
Iterate through the above step to get the desired accuracy.
Most of the time, Text Mining analyzes the text as such and does not require a reference corpus the way NLP does. In the data collection step, an external corpus is rarely required.
Feature engineering is more basic for Text Mining than for NLP. Techniques like n-grams, TF-IDF, cosine similarity, Levenshtein distance, and feature hashing are the most popular in Text Mining (two of them are sketched below).
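Here is a minimal, dependency-free sketch of two of the techniques named above, n-grams and Levenshtein (edit) distance; the inputs are purely illustrative:

def ngrams(tokens, n):
    """All contiguous n-token windows of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def levenshtein(a, b):
    """Edit distance between strings a and b via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

print(ngrams("text mining is not nlp".split(), 2))
# [('text', 'mining'), ('mining', 'is'), ('is', 'not'), ('not', 'nlp')]
print(levenshtein("mining", "meaning"))  # 2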
As mentioned earlier, system accuracy is measurable here, so the run-test-finetune iteration of a model is relatively easy in Text Mining.
Unlike the NLP system, Text Mining systems will have a presentation layer to present mining findings. This is more of an art than engineering.
Future Work – With the increased Internet use, text mining has become increasingly important. New specialized fields, such as web mining and bioinformatics, are emerging. Currently, most of the data mining work lies in data cleaning and data preparation, which is less productive. Active research is happening to automate these works using Machine learning.
NLP is improving every day, but a natural human language is difficult to tackle for machines. We express jokes, sarcasm, and every sentiment quickly, and every human can understand it. We are trying to solve it using an ensemble of deep neural networks. Currently, many NLP researchers focus on automated machine translation using unsupervised models. Natural Language Understanding(NLU) is another field of interest that has a significant impact on Chatbots and humanly understandable robots.
Text Mining vs Natural Language Processing Comparison Table
Below are the points that describe the comparison between Text Mining and Natural Language Processing.
| Basis of Comparison | Text Mining | NLP |
| --- | --- | --- |
| Goal | Extract high-quality information from unstructured and structured text. The information could be patterns in the text or matching structure, but the semantics of the text is not considered. | Understand what is conveyed in natural language by humans, whether text or speech. Semantic and grammatical structures are analyzed. |
| Tools | Text processing languages like Perl; statistical models; ML models | Advanced ML models; deep neural networks; toolkits like NLTK in Python |
| Scope | Extracting representative features from natural language documents; input for corpus-based computational linguistics | The data source can be any form of natural human communication, such as text, speech, or signboards; extracting semantic meaning and grammatical structure from the input; making all levels of interaction with machines more natural for humans |
| Outcome | Correlation within words | Grammatical structure |
| System Accuracy | A performance measure is direct and relatively simple; there are clearly measurable mathematical concepts, and measures can be automated. | Highly difficult to measure system accuracy; human intervention is needed most of the time. For example, for an NLP system that translates from English to Hindi, automating a measure of how accurately the system translates is difficult. |
Conclusion
Both Text Mining and Natural Language Processing try to extract information from unstructured data. NLP tries to get semantic meaning from all means of natural human communication, like text, speech, or even images, and it has the potential to revolutionize the way humans interact with machines; Amazon Echo and Google Home are some examples.
Role Of Microbes In Food Processing
Introduction
Microbiology is the study of different microbes such as bacteria, viruses, slime molds, fungi, and protozoans. Microbes can be unicellular, multicellular, or acellular. The study of microorganisms that modify, colonize, contaminate or process food is known as food microbiology.
History of Food Microbiology
Food history varies with culture, environment, and social and economic impacts. The history of Food Microbiology is classified based on time and periods. In 7000 BC there was evidence of the manufacturing of beer.
The wine was found to be manufactured in about 3500 BC. For the first time, food spoilage was recorded in 6000 BC.
Around 3000 BC, Egyptians started manufacturing cheese and butter.
To prevent the spoilage of food, snow was used to preserve shrimp in 1000 BC.
Food Processing in Households
Curd
Lactobacillus is known as Lactic acid bacteria (LAB).
It grows in milk and converts it into curd.
LAB works by producing lactic acid that will coagulate the milk and will also partially digest milk proteins.
A small quantity of curd acts as an inoculum of LAB. When this inoculum is added to fresh milk and kept at a suitable temperature, the bacteria multiply and convert the milk into curd.
LAB increases the nutritional value of curd by increasing the Vitamin B12 in it.
Curd also acts as a probiotic for our stomach and checks disease-causing organisms.
Cheese
Cheese is one of the oldest items in which microbes were used.
Different varieties of cheese are available based on flavour, texture, and taste. These characteristics are dependent on the type of microbes used.
Swiss cheese contains large holes due to the bacterium Propionibacterium shermanii, which produces a large amount of carbon dioxide.
Cheese is classified into the following types −

| Type of Cheese | Example | Microorganism Used |
| --- | --- | --- |
| Soft | Camembert cheese | Penicillium camemberti |
| Semi-hard | Roquefort cheese | Penicillium roqueforti |
| Hard | Swiss cheese | Propionibacterium shermanii |
Other Products Used in Households
The dough used for making dosa and idli is also fermented by bacteria; these foods puff up due to the release of carbon dioxide in the dough.
Dough used for making bread in bakeries is fermented with Saccharomyces cerevisiae, also known as baker’s yeast.
Toddy, one of the most common traditional drinks, is obtained from the sap of palms and is also fermented by microbes.
Microbes in Industrial Products
Microbes are widely used in different industries for the benefit of living organisms.
Alcoholic Beverages/Fermented Beverages
For ages, yeast has been used for the production of wine, beer, brandy, whiskey, and rum.
Saccharomyces cerevisiae used for bread-making is the same yeast used for the preparation of fermented beverages. This yeast is also known as brewer’s yeast used for fermenting malted cereals and fruit juices, in order to produce ethanol.
Based on the type of raw material used for fermentation, and the type of processing used, different types of alcoholic beverages are obtained.
Other products obtained from yeast fermentation are as follows
Beer is obtained from barley (Hordeum vulgare) malt; its alcohol content is just 3 to 6 percent.
Wine is obtained from grapes; its alcohol content is 10 to 20 percent.
Brandy is obtained by distillation of wine; its alcohol content is 60 to 70 percent.
Gin is obtained from European rye cereal; its alcohol content is 40 percent.
Rum is obtained from sugarcane molasses; its alcohol content is 40 percent.
Production of Antibiotics
The term antibiotic was coined by Selman Waksman in 1942. Antibiotics are considered remarkable discoveries for the welfare of human beings. The word antibiotic means something that fights against life, in terms of disease-causing organisms, and for humans they prove to be beneficial.
The first antibiotic discovered was penicillin, found by Alexander Fleming by chance. He was working on Staphylococcus bacteria growing on unwashed culture plates; the bacteria did not grow around a mould growing on one plate, because the mould released a chemical that inhibited bacterial growth. He named this chemical penicillin, after the mould, which was later named Penicillium notatum.
Antibiotics have greatly improved our ability to fight deadly diseases such as whooping cough, plague, diphtheria, and leprosy, which earlier killed millions of people worldwide.
Organic Acids
The fermentation activities of certain bacteria and fungi are used to obtain organic acids.
Citric acid is obtained by the anaerobic fermentation of sucrose using the fungus Aspergillus niger. It is used in flavouring extracts, medicines, dyes, food, candies, and ink.
Acetic acid, or vinegar, is obtained by a two-step fermentation process: first, carbohydrates are converted into alcohol by alcoholic fermentation; second, the alcohol is oxidized into acetic acid by Acetobacter aceti.
Butyric acid is obtained from Clostridium butylicum.
Enzymes
Enzymes are another milestone obtained through different microbial activities.
Bottled juices are clarified by the use of pectinases and proteases.
Streptokinase, produced from the bacterium Streptococcus, is used as a clot buster to remove clots from the blood vessels of patients who have suffered myocardial infarction.
Conclusion
Microbiology is the study of different microbes such as bacteria, viruses, slime molds, fungi, and protozoans. Food history varies with culture, environment, and social and economic impacts. The history of food microbiology is classified based on time periods. Curd, cheese, dough, and other drinks use different microbes for their processing. Microbes also have industrial applications, namely in obtaining antibiotics, enzymes, alcoholic beverages, and organic acids.
FAQs
Q1. What is Cyclosporin A?
Ans. Cyclosporin is an immunosuppressant obtained from the fungus Trichoderma polysporum. It is used during organ transplants to prevent the rejection of grafts.
Q2. What are statins?
Ans. Statins are obtained from the yeast Monascus purpureus. It inhibits the synthesis of cholesterol and is thus used as a blood cholesterol-lowering agent.
Q3. How are microbes used as biofertilizers?
Ans. Rhizobium bacteria fix atmospheric nitrogen into organic forms. They are cheap and environment-friendly, replenish soil nutrients, and help in organic farming.
Q4. What is single-cell protein?
Ans. Single-cell protein is an alternate source of protein for plants and animals. So, microbes can be used as a good source of proteins.
Applications Of Data Analytics In Hospitality
Most hospitality industry players find it hard to attract new customers and convince them to come back. It is important to find ways to stand out from your competitors when working in a competitive market like the hospitality sector.
Client analytic solutions have proven to be beneficial recently since they detect problem areas and develop the best solution. Data analytics application in the hospitality industry has proven to increase efficiency, profitability, and productivity.
Data analytics gives companies real-time insights that show them, among other things, where improvement is required. Most companies in the hospitality sector have incorporated a data analytics platform to stay ahead of their rivals.
Below we discuss the applications of data analytics in hospitality.
1. Unified Client Experience
Most customers use several gadgets when browsing, booking, and learning more about hotels. This makes it essential to have a mobile-friendly app or website and to make sure the customer can shift from one platform to the other easily.
The customer’s data should be readily accessible despite the booking method or the gadget used during the reservation. Companies that create a multi-platform, seamless customer experience not only enhance their booking experience but also encourage their customers to return.
2. Consolidated Data from Various Channels
Customers enjoy various ways to book rooms and other services, from discount websites to travel agents and direct bookings. It is essential that your enterprise has the relevant information concerning each customer's reservation in order to provide the best service. This data can also be important for analytics.
3. Targeted Discounts and MarketingTargeted marketing is an important tool. Remember, not all guests are looking for the exact thing, and you might share information they are not concerned about by sending them the same promotions.
However, customer analytic solutions assist companies in sending every individual promotion they are interested in, which causes an improved conversion rate. Companies also use these analytics to target their website’s visitors, not just those on the email list.
4. Predictive Analysis
Predictive analysis is an important tool in most industries. It identifies the most suitable course of action for a company's future projects, instead of simply determining how successful a past project has been.
These tools enable businesses to test various options before determining which one has a high chance of succeeding. Consider investing in robust analytics since it saves you significant money and time.
5. Develop Consistent Experiences
The best way to improve client satisfaction and loyalty is to make customer data more accessible to all brand properties. For example, if a hotel has determined former customers' most common preferences and needs, it should make this information accessible to the entire chain.
This enables all hotels to maximize this information, which enables them to provide their customers with a seamless and consistent experience.
6. Enhanced Revenue Management
Data analytics is important in the hospitality industry since it helps hoteliers manage revenue using information acquired from different sources, such as online channels.
Final Thoughts
More and more industries continue adopting data analytics due to its substantial benefits. The above article has discussed data analytics applications in the hospitality sector, and you can reach out for more information.
Let There Be Light: Exploring The World Of Lifi Communication
In today’s fast-paced digital era, there has been a constant search for more efficient and secure ways to transmit information. Enter LiFi, or Light Fidelity – an emerging communication technology that uses LED lights as a gateway for data transfer.
Unlike its radio-wave-based counterpart, WiFi, LiFi harnesses visible light communication (VLC) in the electromagnetic spectrum for high-speed and secure bidirectional data transmission.
This breakthrough holds immense potential in revolutionizing wireless systems while significantly reducing electromagnetic interference.
Key Takeaways
LiFi, or Light Fidelity, is an emerging communication technology that uses visible light waves for high-speed and secure bidirectional data transmission.
With its unique properties of shorter wavelengths and higher frequencies, LiFi offers faster and more reliable connections with significantly reduced latency compared to traditional radio-frequency-based WiFi systems. It also reduces electromagnetic interference.
The potential applications of LiFi are vast – from integration with IoT devices and smart homes to healthcare settings and industrial environments. In these settings, it can provide a secure way for data transfer while reducing the risk of safety hazards caused by radio waves.
Looking ahead, the future of LiFi appears promising. Its integration with IoT devices and smart homes could open up new possibilities for automation in multiple sectors.
How LiFi Works and its Advantages
Li-Fi technology uses light waves in the electromagnetic spectrum to transmit data, resulting in high-speed and secure communication. It operates by modulating the intensity and frequency of an LED light source to encode data, which is then picked up by a photodetector on the receiving end.
Use of Light Waves for Data TransmissionLiFi communication systems capitalize on the unique properties of light waves to transmit data, making it a ground-breaking innovation in wireless technology. Light-emitting diodes (LEDs), which are not only energy-efficient but also capable of emitting modulated light at high speeds, form the backbone of LiFi’s data transmission infrastructure.
Additionally, since LiFi operates within a defined area illuminated by an LED source, it is far less susceptible to signal interference or eavesdropping compared to Wi-Fi networks.
High-Speed and Secure Data TransferLi-Fi technology offers a high-speed and secure data transfer solution. With the use of light waves in the electromagnetic spectrum, Li-Fi can achieve incredible data transmission speeds of up to 224 gigabits per second, which is much faster than Wi-Fi.
What’s more, since Li-Fi uses visible light waves for data transfer, it does not face radio interference or suffer from issues with signal strength as older wireless technologies did.
It also has the potential to be more secure compared to other wireless communication mediums because it cannot penetrate through walls and into neighbouring rooms like traditional Wi-Fi signals do.
Reduction in Electromagnetic Interference
Radio-frequency systems are prone to interference from other electronic devices; LiFi uses light waves instead of radio waves for data transmission, making it immune to that kind of interference.
LiFi could provide an effective solution for areas such as hospitals where medical equipment uses sensitive electronic mechanisms requiring precise location tracking. The high precision offered by Li-Fi aids in ensuring less electromagnetic interference from surrounding devices compared with traditional WiFi networks.
In conclusion, the reduction in electromagnetic interference makes LiFi a highly effective alternative to Wi-Fi, owing to its immunity against disruptions caused by other devices emitting RF energy during operation.
Potential Applications of LiFi
LiFi has potential applications in various fields, such as integration with IoT devices and smart homes, healthcare, industrial settings, and more.
Integration with IoT Devices and Smart HomesLi-Fi technology has the potential to integrate with IoT devices and smart homes, revolutionizing the way we interact with our surroundings. Here are some of the potential benefits and applications −
Smart lighting − Li-Fi enabled bulbs can act as data communication nodes, creating a high-speed network of interconnected bulbs that can be controlled remotely through a smartphone or other device.
Energy efficiency − Li-Fi technology offers energy-efficient solutions for IoT devices by leveraging existing lighting infrastructure. This allows for a reduction in power consumption, making it a viable solution for smart homes.
Indoor positioning system (IPS) − Li-Fi enables accurate indoor positioning, allowing users to track objects or people within a space using light beams. This can be useful in healthcare settings or industrial spaces, where precise tracking is necessary.
Secure communication − The use of light waves makes Li-Fi more secure than Wi-Fi as it cannot pass through walls and is not susceptible to radio interference or hacking attempts.
Increased bandwidth − With higher data transfer speeds compared to Wi-Fi, Li-Fi offers the potential for increased bandwidth in IoT devices and smart homes.
Overall, integration with IoT devices and smart homes could greatly enhance the capabilities and functionalities offered by Li-Fi technology.
Use in Healthcare and Industrial SettingsLi-Fi technology has various potential applications in healthcare and industrial settings, providing a more secure and efficient way to transfer data. Here are some ways Li-Fi can be used −
Medical Data Transfer – In the healthcare industry, Li-Fi can be utilized for high-speed and secure data transfer of medical records, images, and other sensitive information between doctors, hospitals, and clinics. This can lead to faster diagnoses and treatments.
Manufacturing Industry – Li-Fi can also be implemented in the manufacturing industry to provide high-speed communication between machines on production lines. It allows for real-time monitoring of production processes which could lead to increased efficiency, reduced downtime, and cost-effectiveness.
Hazardous Environments – In hazardous environments such as oil refineries or chemical plants where radio waves can cause interference with equipment or pose a fire hazard, Li-Fi technology could be a safer alternative.
Lighting Controls – Li-Fi-enabled lights can also act as a medium of communication with smart lighting systems that detect human presence in a room or location within the facility.
Precision Agriculture – The agriculture sector could also benefit from the use of Li-Fi technology by setting up networked LED systems that would aid in detecting nutrient deficiency or disease identification early on in crops’ growth process.
The versatility of Li-Fi technology opens up possibilities for its use across several industries beyond information and communications technology (ICT). From enhanced security concerns to precision agriculture monitoring capabilities, it is evident that this cutting-edge technology holds an array of untapped potential yet to be explored fully!
The Future of LiFi and its Impact on Communication Technology
The future of LiFi technology appears very promising, and it is expected to have a significant impact on communication technology. With its high data transfer speeds and reduced interference from radio waves, it has the potential to revolutionize the way we communicate wirelessly.
One exciting application of LiFi technology is in smart lighting systems and IoT devices. The ability of LiFi-enabled devices to communicate with each other will lead to new possibilities for automation in homes, industries, and public spaces.
In healthcare settings specifically, Li-Fi could provide more secure communication channels between medical equipment while avoiding the safety concerns associated with prolonged radio-wave exposure for healthcare personnel.
With this potential already demonstrated by research worldwide, including experiments run by the Zumtobel Group company Helvar at Monash University in Melbourne, where an office network based entirely on light sources is being created, there is no telling what developments might come next, making us excited about a bright LiFi-enabled future ahead!
Conclusion
With its potential integration with IoT devices and smart homes, as well as its applications in healthcare and industrial settings, Li-Fi seems to offer endless possibilities.
Language Translation With Transformer In Python!
This article was published as a part of the Data Science Blogathon
Introduction
Natural Language Processing (NLP) is a field at the convergence of artificial intelligence and linguistics. The aim is to make computers understand real-world language, or natural language, so that they can perform tasks like question answering, language translation, and many more.
NLP has lots of applications in different fields.
1. NLP enables the recognition and prediction of diseases based on electronic health records.
2. It is used to analyze customer reviews.
3. To help to identify fake news.
4. Chatbots.
5. Social Media monitoring etc.
What is the Transformer?
The Transformer model architecture was introduced by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin in their paper "Attention Is All You Need". [1]
The Transformer model extracts the features for each word using a self-attention mechanism to know the importance of each word in the sentence. No other recurrent units are used to extract this feature, they are just activations and weighted sums, so they can be very efficient and parallelizable.
Source: "Attention Is All You Need" paper
In the above figure, there is an encoder model on the left side and the decoder on the right. Both encoder and decoder contain a core block of attention and a feed-forward network repeated N number of times.
It has a stack of 6 Encoder and 6 Decoder, the Encoder contains two layers(sub-layers), that is a multi-head self-attention layer, and a fully connected feed-forward network. The Decoder contains three layers(sub-layers), a multi-head self-attention layer, another multi-head self-attention layer to perform self-attention over encoder outputs, and a fully connected feed-forward network. Each sub-layer in Decoder and Encoder has a Residual connection with layer normalization.
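To make "just activations and weighted sums" concrete, here is a minimal, self-contained sketch of scaled dot-product self-attention in PyTorch, a single head with the learned query/key/value projections omitted, so it illustrates the principle rather than the full layer used in the model below:

import math
import torch

def self_attention(x):
    """x: (seq_len, d_model). Each output row is a weighted sum of all rows of x,
    with weights derived from pairwise dot products."""
    d_model = x.size(-1)
    scores = x @ x.transpose(0, 1) / math.sqrt(d_model)  # (seq_len, seq_len)
    weights = torch.softmax(scores, dim=-1)              # attention distribution per word
    return weights @ x                                   # weighted sums of word vectors

x = torch.randn(5, 16)            # 5 tokens, 16-dimensional embeddings
print(self_attention(x).shape)    # torch.Size([5, 16])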
Let's Start Building the Language Translation Model
Here we will use the Multi30k dataset. Don't worry, the dataset is downloaded by the code below.
First, the data processing part: we will use the torchtext module from PyTorch. torchtext has utilities for creating datasets that can be easily iterated over for the purpose of building a language translation model. The code below downloads the dataset, tokenizes the raw text, builds the vocabulary, and converts tokens into tensors.
import math
import torchtext
import torch
import torch.nn as nn
from torchtext.data.utils import get_tokenizer
from collections import Counter
from torchtext.vocab import Vocab
from torchtext.utils import download_from_url, extract_archive
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader
from torch import Tensor
from torch.nn import (TransformerEncoder, TransformerDecoder,
                      TransformerEncoderLayer, TransformerDecoderLayer)
import io
import time

# The base URL and test-set file names were lost in extraction; the values
# below are assumed from the standard Multi30k dataset layout.
url_base = 'https://raw.githubusercontent.com/multi30k/dataset/master/data/task1/raw/'
train_urls = ('train.de.gz', 'train.en.gz')
val_urls = ('val.de.gz', 'val.en.gz')
test_urls = ('test_2016_flickr.de.gz', 'test_2016_flickr.en.gz')

train_filepaths = [extract_archive(download_from_url(url_base + url))[0] for url in train_urls]
val_filepaths = [extract_archive(download_from_url(url_base + url))[0] for url in val_urls]
test_filepaths = [extract_archive(download_from_url(url_base + url))[0] for url in test_urls]

de_tokenizer = get_tokenizer('spacy', language='de_core_news_sm')
en_tokenizer = get_tokenizer('spacy', language='en_core_web_sm')

def build_vocab(filepath, tokenizer):
    counter = Counter()
    with io.open(filepath, encoding="utf8") as f:
        for string_ in f:
            counter.update(tokenizer(string_))
    # The return statement was truncated in extraction; this is the usual form.
    return Vocab(counter, specials=['<unk>', '<pad>', '<bos>', '<eos>'])

de_vocab = build_vocab(train_filepaths[0], de_tokenizer)
en_vocab = build_vocab(train_filepaths[1], en_tokenizer)

def data_process(filepaths):
    raw_de_iter = iter(io.open(filepaths[0], encoding="utf8"))
    raw_en_iter = iter(io.open(filepaths[1], encoding="utf8"))
    data = []
    for (raw_de, raw_en) in zip(raw_de_iter, raw_en_iter):
        de_tensor_ = torch.tensor([de_vocab[token] for token in de_tokenizer(raw_de.rstrip("\n"))],
                                  dtype=torch.long)
        en_tensor_ = torch.tensor([en_vocab[token] for token in en_tokenizer(raw_en.rstrip("\n"))],
                                  dtype=torch.long)
        data.append((de_tensor_, en_tensor_))
    return data

train_data = data_process(train_filepaths)
val_data = data_process(val_filepaths)
test_data = data_process(test_filepaths)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
BATCH_SIZE = 128
# Special-token indices used below for batching and masking (assumed convention).
PAD_IDX = de_vocab['<pad>']
BOS_IDX = de_vocab['<bos>']
EOS_IDX = de_vocab['<eos>']
Then we will use the PyTorch DataLoader module which combines a dataset and a sampler, and it enables us to iterate over the given dataset. The DataLoader supports both iterable-style and map-style datasets with single or multi-process loading, also we can customize loading order and memory pinning.
# DataLoader
def generate_batch(data_batch):
    de_batch, en_batch = [], []
    for (de_item, en_item) in data_batch:
        # Wrap each sentence with beginning-of-sentence and end-of-sentence tokens.
        de_batch.append(torch.cat([torch.tensor([BOS_IDX]), de_item, torch.tensor([EOS_IDX])], dim=0))
        en_batch.append(torch.cat([torch.tensor([BOS_IDX]), en_item, torch.tensor([EOS_IDX])], dim=0))
    de_batch = pad_sequence(de_batch, padding_value=PAD_IDX)
    en_batch = pad_sequence(en_batch, padding_value=PAD_IDX)
    return de_batch, en_batch

train_iter = DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True, collate_fn=generate_batch)
valid_iter = DataLoader(val_data, batch_size=BATCH_SIZE, shuffle=True, collate_fn=generate_batch)
test_iter = DataLoader(test_data, batch_size=BATCH_SIZE, shuffle=True, collate_fn=generate_batch)

Then we design the transformer. The Encoder processes the input sequence by propagating it through a series of multi-head attention and feed-forward network layers. The output of the Encoder, referred to as memory below, is fed to the Decoder along with the target tensors. The Encoder and Decoder are trained in an end-to-end fashion.
# transformer
class Seq2SeqTransformer(nn.Module):
    def __init__(self, num_encoder_layers: int, num_decoder_layers: int,
                 emb_size: int, src_vocab_size: int, tgt_vocab_size: int,
                 dim_feedforward: int = 512, dropout: float = 0.1):
        super(Seq2SeqTransformer, self).__init__()
        encoder_layer = TransformerEncoderLayer(d_model=emb_size, nhead=NHEAD,
                                                dim_feedforward=dim_feedforward)
        self.transformer_encoder = TransformerEncoder(encoder_layer, num_layers=num_encoder_layers)
        decoder_layer = TransformerDecoderLayer(d_model=emb_size, nhead=NHEAD,
                                                dim_feedforward=dim_feedforward)
        self.transformer_decoder = TransformerDecoder(decoder_layer, num_layers=num_decoder_layers)
        self.generator = nn.Linear(emb_size, tgt_vocab_size)
        self.src_tok_emb = TokenEmbedding(src_vocab_size, emb_size)
        self.tgt_tok_emb = TokenEmbedding(tgt_vocab_size, emb_size)
        self.positional_encoding = PositionalEncoding(emb_size, dropout=dropout)

    def forward(self, src: Tensor, trg: Tensor, src_mask: Tensor, tgt_mask: Tensor,
                src_padding_mask: Tensor, tgt_padding_mask: Tensor,
                memory_key_padding_mask: Tensor):
        src_emb = self.positional_encoding(self.src_tok_emb(src))
        tgt_emb = self.positional_encoding(self.tgt_tok_emb(trg))
        memory = self.transformer_encoder(src_emb, src_mask, src_padding_mask)
        outs = self.transformer_decoder(tgt_emb, memory, tgt_mask, None,
                                        tgt_padding_mask, memory_key_padding_mask)
        return self.generator(outs)

    def encode(self, src: Tensor, src_mask: Tensor):
        return self.transformer_encoder(self.positional_encoding(self.src_tok_emb(src)), src_mask)

    def decode(self, tgt: Tensor, memory: Tensor, tgt_mask: Tensor):
        return self.transformer_decoder(self.positional_encoding(self.tgt_tok_emb(tgt)), memory, tgt_mask)

The text, once converted to tokens, is represented using token embeddings. A positional encoding is added to the token embedding so that the model gets a notion of word order.
class PositionalEncoding(nn.Module):
    def __init__(self, emb_size: int, dropout, maxlen: int = 5000):
        super(PositionalEncoding, self).__init__()
        den = torch.exp(-torch.arange(0, emb_size, 2) * math.log(10000) / emb_size)
        pos = torch.arange(0, maxlen).reshape(maxlen, 1)
        pos_embedding = torch.zeros((maxlen, emb_size))
        pos_embedding[:, 0::2] = torch.sin(pos * den)
        pos_embedding[:, 1::2] = torch.cos(pos * den)
        pos_embedding = pos_embedding.unsqueeze(-2)
        self.dropout = nn.Dropout(dropout)
        self.register_buffer('pos_embedding', pos_embedding)

    def forward(self, token_embedding: Tensor):
        return self.dropout(token_embedding + self.pos_embedding[:token_embedding.size(0), :])

class TokenEmbedding(nn.Module):
    def __init__(self, vocab_size: int, emb_size):
        super(TokenEmbedding, self).__init__()
        self.embedding = nn.Embedding(vocab_size, emb_size)
        self.emb_size = emb_size

    def forward(self, tokens: Tensor):
        return self.embedding(tokens.long()) * math.sqrt(self.emb_size)

In the code below, a subsequent-word mask is created to stop a target word from attending to the words that follow it. Masks for hiding source and target padding tokens are also created.
def generate_square_subsequent_mask(sz):
    mask = (torch.triu(torch.ones((sz, sz), device=DEVICE)) == 1).transpose(0, 1)
    mask = mask.float().masked_fill(mask == 0, float('-inf')).masked_fill(mask == 1, float(0.0))
    return mask

def create_mask(src, tgt):
    src_seq_len = src.shape[0]
    tgt_seq_len = tgt.shape[0]
    tgt_mask = generate_square_subsequent_mask(tgt_seq_len)
    src_mask = torch.zeros((src_seq_len, src_seq_len), device=DEVICE).type(torch.bool)
    src_padding_mask = (src == PAD_IDX).transpose(0, 1)
    tgt_padding_mask = (tgt == PAD_IDX).transpose(0, 1)
    return src_mask, tgt_mask, src_padding_mask, tgt_padding_mask

Then define the model parameters and instantiate the model.
SRC_VOCAB_SIZE = len(de_vocab)
TGT_VOCAB_SIZE = len(en_vocab)
EMB_SIZE = 512
NHEAD = 8
FFN_HID_DIM = 512
BATCH_SIZE = 128
NUM_ENCODER_LAYERS = 3
NUM_DECODER_LAYERS = 3
NUM_EPOCHS = 50
DEVICE = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

transformer = Seq2SeqTransformer(NUM_ENCODER_LAYERS, NUM_DECODER_LAYERS, EMB_SIZE,
                                 SRC_VOCAB_SIZE, TGT_VOCAB_SIZE, FFN_HID_DIM)
for p in transformer.parameters():
    # Initialize only weight matrices; xavier_uniform_ fails on 1-D bias tensors.
    if p.dim() > 1:
        nn.init.xavier_uniform_(p)
transformer = transformer.to(device)

loss_fn = torch.nn.CrossEntropyLoss(ignore_index=PAD_IDX)
optimizer = torch.optim.Adam(transformer.parameters(), lr=0.0001,
                             betas=(0.9, 0.98), eps=1e-9)
# Checkpoint destination; the original article never defines PATH, so this
# value is an assumption.
PATH = 'transformer_checkpoint.pt'

def train_epoch(model, train_iter, optimizer):
    model.train()
    losses = 0
    for idx, (src, tgt) in enumerate(train_iter):
        src = src.to(device)
        tgt = tgt.to(device)
        tgt_input = tgt[:-1, :]
        src_mask, tgt_mask, src_padding_mask, tgt_padding_mask = create_mask(src, tgt_input)
        logits = model(src, tgt_input, src_mask, tgt_mask,
                       src_padding_mask, tgt_padding_mask, src_padding_mask)
        optimizer.zero_grad()
        tgt_out = tgt[1:, :]
        loss = loss_fn(logits.reshape(-1, logits.shape[-1]), tgt_out.reshape(-1))
        loss.backward()
        optimizer.step()
        losses += loss.item()
    torch.save(model, PATH)
    return losses / len(train_iter)

def evaluate(model, val_iter):
    model.eval()
    losses = 0
    for idx, (src, tgt) in enumerate(valid_iter):
        src = src.to(device)
        tgt = tgt.to(device)
        tgt_input = tgt[:-1, :]
        src_mask, tgt_mask, src_padding_mask, tgt_padding_mask = create_mask(src, tgt_input)
        logits = model(src, tgt_input, src_mask, tgt_mask,
                       src_padding_mask, tgt_padding_mask, src_padding_mask)
        tgt_out = tgt[1:, :]
        loss = loss_fn(logits.reshape(-1, logits.shape[-1]), tgt_out.reshape(-1))
        losses += loss.item()
    return losses / len(val_iter)

Now, train the model.

for epoch in range(1, NUM_EPOCHS + 1):
    start_time = time.time()
    train_loss = train_epoch(transformer, train_iter, optimizer)
    end_time = time.time()
    val_loss = evaluate(transformer, valid_iter)
    print((f"Epoch: {epoch}, Train loss: {train_loss:.3f}, Val loss: {val_loss:.3f}, "
           f"Epoch time = {(end_time - start_time):.3f}s"))

Trained with the transformer architecture, this model trains faster and converges to a lower validation loss than comparable RNN models.
def greedy_decode(model, src, src_mask, max_len, start_symbol):
    src = src.to(device)
    src_mask = src_mask.to(device)
    memory = model.encode(src, src_mask)
    ys = torch.ones(1, 1).fill_(start_symbol).type(torch.long).to(device)
    for i in range(max_len - 1):
        memory = memory.to(device)
        memory_mask = torch.zeros(ys.shape[0], memory.shape[0]).to(device).type(torch.bool)
        tgt_mask = (generate_square_subsequent_mask(ys.size(0)).type(torch.bool)).to(device)
        out = model.decode(ys, memory, tgt_mask)
        out = out.transpose(0, 1)
        prob = model.generator(out[:, -1])
        _, next_word = torch.max(prob, dim=1)
        next_word = next_word.item()
        # Append the greedily chosen word and stop at end-of-sentence.
        ys = torch.cat([ys, torch.ones(1, 1).type_as(src.data).fill_(next_word)], dim=0)
        if next_word == EOS_IDX:
            break
    return ys

def translate(model, src, src_vocab, tgt_vocab, src_tokenizer):
    model.eval()
    tokens = [BOS_IDX] + [src_vocab.stoi[tok] for tok in src_tokenizer(src)] + [EOS_IDX]
    num_tokens = len(tokens)
    src = (torch.LongTensor(tokens).reshape(num_tokens, 1))
    src_mask = (torch.zeros(num_tokens, num_tokens)).type(torch.bool)
    tgt_tokens = greedy_decode(model, src, src_mask, max_len=num_tokens + 5,
                               start_symbol=BOS_IDX).flatten()
    # The final line of translate() was truncated in extraction; this is the usual form.
    return " ".join([tgt_vocab.itos[tok] for tok in tgt_tokens]).replace("<bos>", "").replace("<eos>", "")

Now, let's test our model on a translation.

output = translate(transformer, "Eine Gruppe von Menschen steht vor einem Iglu .",
                   de_vocab, en_vocab, de_tokenizer)
print(output)

The printed sentence is the output of the translation model. You can compare it with Google Translate.
Source: Google Translator
The above translation and the output from our model matched. The model is not the best but still does the job up to some extent.
Reference
[1] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. "Attention Is All You Need."

Thank you
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.