Current Projects
Supporting unstructured querying on graphs
This project broadly focuses on supporting unstructured (keyword) queries on graph-structured data such as RDF or XML. Currently, I am working on the post-processing step: ranking the query results and reducing redundant information to improve the user experience.
Exploratory Top-k query processing
This project aims to provide exploratory search over knowledge graphs (KGs). In a KG, the user often has to query iteratively before the information need is satisfied. Given a query, we can apply multiple relaxations (rephrasings) to it to retrieve the results sought, which might not appear for the exact query. This yields a large search space of answers. Our work involves optimizing top-k query processing in this multiple-relaxation scenario with probabilistic guarantees.
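As a rough illustration of the multiple-relaxation setting, the sketch below merges ranked answer lists from an exact query and its relaxations and stops as soon as k distinct answers are found. It is a minimal example under stated assumptions, not the project's algorithm: the function name `top_k_over_relaxations`, the per-relaxation weights, and the rule that an answer's score is its best weighted score across lists are all hypothetical, and the merge is exact, so it does not model the probabilistic guarantees mentioned above.

```python
import heapq


def top_k_over_relaxations(relaxations, k):
    """Return the k best distinct answers across several relaxed queries.

    relaxations: list of (weight, ranked_list) pairs, where ranked_list is a
    list of (answer, score) tuples sorted by score in descending order and
    weight is a non-negative penalty factor (hypothetical: 1.0 for the exact
    query, smaller values for looser relaxations).
    """
    # The heap holds the next unread entry of every list, keyed by its
    # weighted score (negated, because heapq is a min-heap).
    heap = []
    for idx, (weight, ranked) in enumerate(relaxations):
        if ranked:
            heapq.heappush(heap, (-weight * ranked[0][1], idx, 0))

    results = []
    seen = set()
    while heap and len(results) < k:
        neg_score, idx, pos = heapq.heappop(heap)
        weight, ranked = relaxations[idx]
        answer = ranked[pos][0]
        # Entries are popped in globally descending weighted-score order, so
        # the first time an answer appears it carries its best score.
        if answer not in seen:
            seen.add(answer)
            results.append((answer, -neg_score))
        # Advance the list we just read from.
        if pos + 1 < len(ranked):
            heapq.heappush(heap, (-weight * ranked[pos + 1][1], idx, pos + 1))
    return results


if __name__ == "__main__":
    exact = (1.0, [("a", 0.9), ("b", 0.4)])
    relaxed_1 = (0.8, [("c", 1.0), ("a", 0.95), ("d", 0.3)])
    relaxed_2 = (0.5, [("e", 1.0), ("b", 0.9)])
    print(top_k_over_relaxations([exact, relaxed_1, relaxed_2], 3))
```

Because every list is read lazily from the top, the merge touches only as many entries as it needs to fill the top k, which is the kind of saving that matters when the number of relaxations is large.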
Past Projects
Polarizer -- University Hack Day Project [August 2013]
Polarizer is a sentiment analysis engine developed as part of HackU, Yahoo!'s annual University Hack Day. We implemented peer-to-peer insult filtering and pro-con classification of user comments on online public debates. We also added comment ranking and a location-based demographic sentiment heatmap generated with heatmap.js. Polarizer won first prize, competing against 40 other teams.
Ontology Extraction by focused retrieval using the web as an Oracle -- M.Tech Thesis [Jan 2012 - May 2012]
This project involved building a domain ontology using Latent Semantic Indexing (LSI) and theme extraction techniques. LSI discovers the concept and term nodes, while theme extraction and clustering techniques are used to extract the relations. The final ontology takes the form of a bipartite graph.
Focused Web information retrieval for building a contextual knowledge base using statistical learning techniques -- Research Project [Jan 2011 - Dec 2011]
In this project, statistical learning techniques such as incremental clustering and classification (of both documents and words) were used to retrieve web information for building a contextual knowledge base. The learning ensures that the retrieved web pages are relevant to the current focus, which is defined by the seed words and seed sites.
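To give a feel for incremental document clustering, here is a minimal single-pass ("leader") sketch: each incoming document joins the most similar existing cluster if the cosine similarity to that cluster's centroid passes a threshold, and otherwise starts a new cluster. This is a generic illustration, not the method used in the project. The class name `IncrementalClusterer`, the whitespace tokenization, and the threshold value are all assumptions, and seed words and seed sites are not modeled.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class IncrementalClusterer:
    """Single-pass clustering: documents are assigned as they arrive."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # hypothetical similarity cut-off
        self.centroids = []         # summed term counts, one Counter per cluster
        self.members = []           # document ids, one list per cluster

    def add(self, doc_id, text):
        """Assign a document to a cluster and return that cluster's index."""
        vector = Counter(text.lower().split())
        best_idx, best_sim = None, 0.0
        for idx, centroid in enumerate(self.centroids):
            sim = cosine(vector, centroid)
            if sim > best_sim:
                best_idx, best_sim = idx, sim
        if best_idx is not None and best_sim >= self.threshold:
            # Fold the document into the closest cluster's centroid.
            self.centroids[best_idx].update(vector)
            self.members[best_idx].append(doc_id)
            return best_idx
        # No cluster is close enough: start a new one.
        self.centroids.append(vector)
        self.members.append([doc_id])
        return len(self.centroids) - 1


if __name__ == "__main__":
    clusterer = IncrementalClusterer(threshold=0.3)
    clusterer.add("d1", "graph query ranking keyword search")
    clusterer.add("d2", "keyword search on graph data")
    clusterer.add("d3", "sentiment heatmap of user comments")
    print(clusterer.members)
```

The same scheme can be applied to words instead of documents by swapping the term-count vector for a word's context vector.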