Awareness about fake news is growing considerably among us, and that is a very good thing. In that context, during my M.Tech I did some work on studying what exactly fake news is and whether it can be identified automatically. I have often thought of open-sourcing that work, but for lack of time I have not yet been able to give it a shareable form. If you want to give people something to use, it should be usable for them: other developers should be able to understand it properly and, if possible, add to it. I have not managed that so far.
Anyway.
Below is a part of its report. Maybe someone will find it useful.
ABSTRACT:
News articles that are intentionally and verifiably false, and could mislead readers, are called fake news. Such articles are mainly used to spread hatred and manipulate public opinion, among other serious purposes. They are mostly spread through digital media platforms like Facebook and WhatsApp, and through online-only news sites. Avoiding fake news, or making it easier to identify, is an important task for many digital platforms and for society as a whole. Currently, the widely used methods to control fake news are manual. Many platforms allow users to report such news so that the platforms can verify it manually. Some platforms proactively monitor the news and control it. Given the volume and variety of content and users, this is a very challenging task.
There have been attempts to automate the process using various natural language processing and machine learning techniques. Methods of stance detection and text classification have been applied, and they have achieved considerable ‘contest’ accuracy. However, fake news keeps changing, and plain text classification methods based on historical data are not sufficient in the real world.
This work attempts to identify fake news using unsupervised and supervised methods. The devised system tries to verify candidate fake news against other news sources on the web. It also uses text classification based on historical data to identify fake news. The report details the explorations done to devise the system and reports some of the evaluations carried out.
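As a rough illustration of the verification idea, the sketch below scores a candidate headline against texts from other news sources using bag-of-words cosine similarity. This is not the system described in the report: the function names, the sample texts and the 0.5 threshold are all assumptions made only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the bag-of-words vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_corroborated(candidate, reference_articles, threshold=0.5):
    """Return True if any reference text is similar enough to the candidate.

    The threshold is a hypothetical value; a real system would tune it.
    """
    scores = (cosine_similarity(candidate, ref) for ref in reference_articles)
    return max(scores, default=0.0) >= threshold


if __name__ == "__main__":
    # Made-up reference texts standing in for articles fetched from the web.
    references = [
        "City council approves new budget for road repairs",
        "Local team wins the regional football championship",
    ]
    print(is_corroborated("City council approves budget for road repairs", references))
    print(is_corroborated("Aliens land in city centre and demand pizza", references))
```

Word overlap only tells us that other sources discuss the same topic. It does not tell us whether they agree with the claim or refute it, which is where stance detection would come in.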
1.1 Overview
“To be news articles that are intentionally and verifiably false, and could mislead readers”
This is one of the definitions of fake news, and a recent study [1] used it to define fake news. The study further says that the main motivations behind the production of fake news are financial and ideological. Both aspects, financial and ideological, are important to human society. Fake news is an effort to alter and manipulate them, and that is fundamentally wrong and unethical.
Fake news has existed for a long time, and history offers many examples. What has changed is the speed at which fake news travels. With rapid developments in communication and computing technologies, spreading fake news is just a click or swipe away, and it affects the normal course of life of a society with the same speed.
Corporations and governments are trying hard to minimize fake news and its repercussions. Technological interventions can be devised to tackle it to some extent. We can explore technological solutions and their applications to deal with fake news and contribute to the efforts to minimize it and its repercussions.
The repercussions of fake news are so serious that they can change the course of a society's progress. Recently it has shown its impact in three important events: first, the American presidential election; second, Brexit; and third, the last Indian election and the post-election events.
Various studies have been conducted to assess the impact of fake news on various events. The authors of [2], from MIT, studied fake news and its spread in comparison with real news. They found astonishing facts about fake news, one of which is reproduced below for ready reference.
Falsehood diffused significantly farther, faster, deeper, and more broadly than the truth in all categories. The effects were more pronounced for false political news than for news about terrorism, natural disasters, science, urban legends, or financial information. Controlling for many factors, false news was 70% more likely to be retweeted than the truth.
The finding is surprising because our, or at least my, assumption about collective public opinion is that humans collectively make the right decisions. This assumption is supported by the finding that Wikipedia is about as ‘good and erroneous’ a source of information as an encyclopedia [3]. It is also evident from the success of crowdsourcing. But when it comes to news, the picture is different: the fake, i.e. the false, is held higher than the real. This contradiction could be the subject of another study. There must be factors that affect these two cases differently.
Fake news differs from typical real news in the following ways:
Fake news is made up of content that gives rise to feelings like fear, disgust and surprise in the minds of readers, whereas real news gives rise to feelings like joy, sadness and trust.
Fake news has a shocking claim in the headline and sounds unbelievable.
It is mostly published by a not-so-reliable source or circulated on social media sites.
It is not professionally written; there may be spelling errors and bad formatting.
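Some of the cues listed above can be turned into simple numeric features. The following is a minimal sketch under stated assumptions, not the feature set used in the report: the word list `SHOCK_WORDS`, the domain list `RELIABLE_DOMAINS`, the function `surface_cues` and the caller-supplied `vocabulary` are hypothetical. The emotion cue is left out because it would need an emotion lexicon.

```python
import re
from urllib.parse import urlparse

# Hypothetical lists, chosen only for illustration.
SHOCK_WORDS = {"shocking", "unbelievable", "exposed", "miracle", "secret"}
RELIABLE_DOMAINS = {"example-news.org"}  # placeholder domain


def surface_cues(headline, body, url, vocabulary):
    """Compute simple surface cues for one article.

    `vocabulary` is a set of correctly spelled lowercase words supplied by
    the caller; body words outside it count as possible misspellings.
    """
    headline_words = re.findall(r"[A-Za-z']+", headline)
    body_words = re.findall(r"[A-Za-z']+", body)
    domain = urlparse(url).netloc.lower()
    unknown = [w for w in body_words if w.lower() not in vocabulary]
    return {
        # Shocking claim in the headline.
        "shock_words_in_headline": sum(w.lower() in SHOCK_WORDS for w in headline_words),
        # Bad formatting: words written fully in capitals, exclamation marks.
        "all_caps_headline_words": sum(w.isupper() and len(w) > 1 for w in headline_words),
        "exclamation_marks": headline.count("!") + body.count("!"),
        # Source not on the (hypothetical) list of reliable domains.
        "unreliable_source": domain not in RELIABLE_DOMAINS,
        # Rough proxy for spelling errors.
        "unknown_word_ratio": len(unknown) / len(body_words) if body_words else 0.0,
    }


if __name__ == "__main__":
    cues = surface_cues(
        headline="SHOCKING: Miracle fruit cures everything!",
        body="Doctors are amazd by this frut",
        url="http://totally-real.example/story",
        vocabulary={"doctors", "are", "amazed", "by", "this", "fruit"},
    )
    for name, value in cues.items():
        print(name, value)
```

Features like these could feed a supervised text classifier, but none of them is decisive on its own: a real article can have a typo, and a fake one can be carefully written.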
This work aims to contribute to the task of controlling fake news by applying text processing to news articles. The following sections will help readers understand the technical definition of fake news and the use of various approaches to build a fake news identification system. The report also discusses the system that was built, its advantages, limitations and future scope.
For the complete report -
https://github.com/pbpimpale/fakenews/blob/master/FakeNews%20-%20Copy%20for%20Sharing.pdf