Anirban Sen


Department of Computer Science
Email: anirban@cse.iitd.ac.in
Linkedin handle: Anirban@Linkedin
Phone: +91 78 3816 0298 (Mobile), +91 11 2659 7300 (Office)
Office: Room 114, ICTD Lab, SIT Building

Research

I am a research scholar at IIT Delhi, working with Dr. Aaditeshwar Seth in the ACT4D team. I joined IIT Delhi in 2014 after my master's in computer science from IIEST Kolkata, under the guidance of Dr. Saptarshi Ghosh. The broad area of my research is computational social science, wherein I analyze publicly available web data on a large scale, and build tools to better inform citizens about some of the facets of political economy. I am interested in applying big data analysis, data science, and AI-based solutions to solve social problems.

Current Work

There are several factors that can adversely impact development through incorrect or inappropriate policy formulation. These factors often arise when participants in the democratic process, e.g., the mass media, the citizens, and the Parliament, do not function properly. For instance, a biased media leads to a polarized public opinion, resulting in elections not being contested on well-informed policy grounds. Similarly, incorrect information propagating through channels like mass media and social media can leave citizens misinformed. This results in undeserving candidates securing key positions, often leading to improper policy formulation and implementation. Corruption involving corporate and government entities can lead to corporates influencing policy to skew it in their favor. Lack of nuanced understanding of the policy requirements can also lead to superficial parliamentary discourse on the citizens' concerns. My research touches upon each of these factors, by analyzing data collected from publicly available web-based sources.

For instance, I study large-scale news data collected from prominent online newspapers, to understand if they are biased in terms of topics covered on the policy events, the dominant frames of presenting these topics, and the coverage they provide to various entities (politicians, bureaucrats, judiciary members) and political parties. I also study social media data to find out if the followers of these news sources amplify or counter these biases on Twitter. To study the nature of policy discourse, I study statements made by influential entities (the power elites) in mass media, and the questions asked by the policymakers in the Parliament, using parliamentary Question-Hour data. With the help of my team, I have developed a corporate-government knowledge base for India, which provides information on corporate-government interlocks, for the purpose of observing the collusion between corporate and government entities. I have reported most of my findings on a website titled the Giant Economy Monitor. The purpose of this site is to bring out the results of my research in the public domain. I plan to update it with new data (e.g., new topics, new content on topics under consideration, new policy events, and further analysis of this data) so that it can present a time-evolving and current picture of the political economy.

Additionally, to counter algorithmic biases existing in news recommendation systems, I have proposed a novel news recommendation algorithm that ensures fairness and diversity in news recommendation, considering a temporally evolving news feed. The target of this tool is to provide equitable coverage to all aspects or topics of discussion of a policy, while also bringing diversity to aspect recommendation, thereby providing people with a more complete picture of any policy event. I am also planning to build tools that counter misinformation spread on online media sources, to provide people with authentic information about these events. My research and future plans are further elaborated in my research statement. My PhD thesis can be found here.

Previous Work

  1. Predicting Virality of Tweets in Online Social Networks: On Twitter, many topics (hashtags) become viral with time and are retweeted thousands of times. Other topics die a natural death within days, hours, or even minutes. This work concerns identifying structural features, including tweet features, user network based features, conductance based features, and users' geographical features, to build a classifier that can predict if a topic will eventually become viral by just looking at the initial few tweets on the hashtag. We used a Random Forest based classifier, which with the help of these features solves the task with high precision.

  2. Extracting Situational Updates from Microblogs during Calamities [master's thesis]: Twitter gets flooded with tweets immediately after a disaster or calamity. Some of these tweets are purely emotional, while some provide actionable, factual information that can help in knowing important details of an event. The latter are known as Situational Updates. My work in this project involved analysing tweets corresponding to four disasters, namely the Uttarakhand Flood (India), Hyderabad Blasts (India), Sandy Hook School Shooting (US), and Typhoon Bopha (Philippines), and extracting a set of tweet and user based features to build a classifier that would effectively identify situational updates. We used a support vector machine (SVM) for this purpose, and came up with an interesting set of features that work with high accuracy.

  3. Automated Diagram Generation from School Level Geometry Problems [B.Tech thesis]: In this project, we used natural language processing techniques to build a system that parses school level geometry problems in English and generates an intermediate, platform-independent graphical representation. From this representation, information can be extracted into a form compatible with any general graphical software (like OpenGL) that can be used to create a diagram in support of the problem statement. We tested our system on a set of around 500 geometry problems of varied complexity, and found it to perform with high precision.

Publications

  1. Ideology Detection in the Indian Mass Media
    Ankur Sharma (IIT Delhi), Navreet Kaur (IIT Delhi), Anirban Sen (IIT Delhi), and Dr. Aaditeshwar Seth (IIT Delhi)
    [recently accepted in ASONAM 2020]

  2. What Drives Location Preference for Corporate Social Responsibility (CSR) Investments in India?
    Varun Pareek (IIT Delhi), Rohit Sharma (IIT Delhi), Anirban Sen (IIT Delhi), Manikaran Kathuria (IIT Delhi), Arundeep Gupta (IIT Delhi), and Dr. Aaditeshwar Seth (IIT Delhi)
    ACM COMPASS 2020 [online conference]

  3. Studying the discourse on economic policies in India using mass media, social media, and the parliamentary question hour data
    Anirban Sen (IIT Delhi), Saloni Bhogale (Ashoka University), Dr. Priyamvada Trivedi (Ashoka University), Dr. Aaditeshwar Seth (IIT Delhi), and the ACT4D team (IIT Delhi)
    ACM COMPASS 2019, Accra, Ghana

  4. An Attempt at Using Mass Media Data to Analyze the Political Economy Around Some Key ICTD Policies in India
    Anirban Sen (IIT Delhi), Dr. Aaditeshwar Seth (IIT Delhi), and the ACT4D team (IIT Delhi)
    ICTDX, Ahmedabad, Gujarat, India

  5. Empirical Analysis of the Presence of Power Elite in Media
    Anirban Sen (IIT Delhi), Dr. Aaditeshwar Seth (IIT Delhi), and the ACT4D team (IIT Delhi)
    ACM COMPASS 2018, Facebook, Menlo Park, CA, USA

  6. Leveraging Web Data to Monitor Changes in Corporate-Government Interlocks in India
    Anirban Sen (IIT Delhi), Dr. Aaditeshwar Seth (IIT Delhi), and the ICTD team (IIT Delhi)
    ACM COMPASS 2018, Facebook, Menlo Park, CA, USA

  7. Improving Similar Question Retrieval using a Novel Tripartite Neural Network based Approach
    Anirban Sen (IIT Delhi), Manjira Sinha (XRCI), and Sandya Mannaswamy (XRCI)
    Forum for Information Retrieval Evaluation (FIRE-2017), Indian Institute of Science, Bangalore

  8. Stance Classification of Multi-Perspective Consumer Health Information
    Anirban Sen (IIT Delhi), Manjira Sinha (XRCI), Sandya Mannaswamy (XRCI), and Shourya Roy (XRCI)
    ACM CoDS-COMAD 2018, Goa University, Goa

  9. Fine-grained Emotion Detection in Contact Center Chat Utterances
    Shreshtha Mundra (XRCI), Anirban Sen (IIT Delhi), Sandya Mannaswamy (XRCI), Manjira Sinha (XRCI), and Shourya Roy (XRCI)
    21st Pacific Asia Conference on Knowledge Discovery and Data Mining 2017 (PAKDD 2017). Jeju Island, South Korea

  10. Multi-task Representation Learning for Enhanced Emotion Categorization in Short Text
    Anirban Sen (IIT Delhi), Sandya Mannaswamy (XRCI), Manjira Sinha (XRCI), and Shourya Roy (XRCI)
    21st Pacific Asia Conference on Knowledge Discovery and Data Mining 2017 (PAKDD 2017). Jeju Island, South Korea

  11. Embedding Learning of Figurative Phrases for Emotion Classification in Micro-Blog Texts
    Shreshtha Mundra (XRCI), Manjira Sinha (XRCI), Sandya Mannaswamy (XRCI), Anirban Sen (IIT Delhi), and Shourya Roy (XRCI)
    The Fourth ACM IKDD Conference on Data Sciences (CODS 2017)

  12. On the role of conductance, geography and topology in predicting hashtag virality
    Siddharth Bora, Harvineet Singh, Anirban Sen, Amitabha Bagchi, Parag Singla
    Social Network Analysis and Mining, December 2015

  13. Extracting Situational Awareness from Microblogs during Disaster Events
    Anirban Sen, Koustav Rudra, Saptarshi Ghosh
    Social Networking Workshop, COMSNETS 2015, 7th International Conference on Communication Systems & Networks

  14. Text-to-Diagram Conversion: A Method for Formal Representation of Natural Language Geometry Problems
    Anirban Mukherjee, Sarbartha Sengupta, Dipanjan Chakraborty, Anirban Sen, and Utpal Garain
    IASTED International Conference on Artificial Intelligence and Applications (AIA 2013). Innsbruck, Austria.

Patents [Innovation Document (ID) registered as chief inventor]

  • Stance Detection in Healthcare Related Query Responses (Conduent Labs India, earlier known as Xerox Research Center India)
  • Fine-grained Emotion Detection in Contact Center Chat Utterances (Conduent Labs India, earlier known as Xerox Research Center India)

Work Experience

  • Selected for internship at Xerox Research Centre India in the Text and Graph Analytics group (June - December, 2016)
  • Teaching Assistant at IIT Delhi for courses Introduction to Computing, Database Management Systems, Computer Networks, and Data Structures
  • Worked at Tata Consultancy Services from December 2010 to July 2012 as Technical Editor.

Scholarships and Grants

  • TCS Innovation Labs scholarship for PhD Scholars (July 2014)
  • Microsoft Research travel grant (May 2017)
  • ACM IARCS travel grant (May 2017)
  • Microsoft Research travel grant (May 2018)

Education

  • B.Tech (Undergraduate) in Information Technology: RCC Institute of Information Technology, Kolkata (affiliated to West Bengal University of Technology)
  • M.E. (Postgraduate) in Computer Science: Indian Institute of Engineering Science and Technology, West Bengal
  • Ph.D. (Research) in Computer Science: IIT Delhi (currently pursuing)

Some of the courses taken in IIEST (Postgraduation)

  • Principles of Programming Language
  • Programming Logic and Artificial Intelligence
  • Information Theory
  • Fuzzy Logic and Its Applications
  • Algorithms and Data Structures
  • Image Processing

Courses taken in IIT Delhi

  • Analysis of Media (paper reading course)
  • Machine Learning
  • Advanced Data Structures
  • Advanced Data Management
  • Virtualization and Cloud
  • Numerical Algorithms
  • Communication Skills

Other Interests

  • Reading: Some of my favorite authors (fiction) are Haruki Murakami, Agatha Christie, Dan Brown, Bibhutibhushan Bandopadhyay, and Narayan Gangopadhyay. I also like to read manga (Itagaki Keisuke's Baki series being my favorite to date).
  • Movies and series: I prefer watching movies and web series on history, drama, crime, horror, and science. Favorite actors: Tulsi Chakraborty, Utpal Dutt, Rabi Ghosh, Robert De Niro, Al Pacino, Nandita Das, Nawazuddin Siddiqui, Irfan Khan, Tom Hanks, and many more
  • Sports: I am interested in collecting information on various forms of Martial Arts. I regularly follow UFC and bodybuilding.
  • Travel: I am an avid traveller. Most of my best travel experiences are from India, when I am accompanied by my partner Indrabati and my backpack.