Faculty Advisor - Dr. Mayank Singh
Recent years have seen massive progress in the development of code generation systems for domain-specific languages (DSLs) using sequence-to-sequence deep learning techniques. In this project, we experiment specifically with AlgoLisp DSL-based generative models and showcase their extreme dataset bias through different classes of adversarial examples. We also present a simple transformer-based encoder-decoder model that outperforms all of the AlgoLisp DSL-based baselines. However, consistent with the previous baselines, the proposed model performs poorly under adversarial settings.
Faculty Advisor - Dr. Udit Bhatiya
Currently, deterministic mathematical models are used to predict rainfall and for other related tasks. Since machine learning has produced state-of-the-art results on many different tasks, my task is to tackle the predictability of the monsoon using machine learning algorithms.
Fraud detection in a Bitcoin network is a fairly complicated task, first because its decentralized nature makes it hard to pinpoint anomalous users, and second because of the lack of labeled data. I used unsupervised learning techniques such as k-means and DBSCAN to detect anomalous transactions and users. The approach was to generate two graphs, one for finding anomalous users and the other for finding anomalous transactions, and to use dual evaluation as the performance metric.
[Report]
Operating Systems Course Project
MapReduce is a programming model suited to processing huge amounts of data in parallel. It works in two phases, the Map phase and the Reduce phase, and the input to each phase is a set of key-value pairs. Execution goes through four stages: splitting, mapping, shuffling, and reducing. Our task was to implement a MapReduce framework that can be used as a general-purpose library in C, with several benchmarks to test its performance.
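The four stages named above can be sketched as a small word-count example. This is an illustration only, written in Python rather than the C of the actual library, and it runs sequentially; the function names (`split_input`, `map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and do not come from the project.

```python
from collections import defaultdict


def split_input(text, num_chunks):
    """Splitting: deal the input lines out into num_chunks chunks."""
    lines = text.splitlines()
    return [lines[i::num_chunks] for i in range(num_chunks)]


def map_phase(chunk):
    """Mapping: emit a (word, 1) key-value pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_chunks):
    """Shuffling: group the values of all chunks by key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reducing: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(text, num_chunks=2):
    """Run the four stages in order; a real framework would map chunks in parallel."""
    chunks = split_input(text, num_chunks)
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))
```

Calling `word_count("a b a\nb c")` counts each word across both chunks. In a parallel framework, each call to `map_phase` would run in its own worker, and the shuffle step would route keys to reducers.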
[Slides]
Detection of fake news is a very challenging task, so we focus on a subset of it (political fake news) using the LIAR++ dataset. I used a custom LSTM model integrated with pretrained GloVe embeddings to process metadata such as subject, context, and POS tags of the fake news along with the main text, producing a final vector with a binary classification accuracy of 62.5%. One key insight was that including the justification of each claim increases the accuracy of the model.
[Report]
Developed a website that displays events happening on the IITGN campus. The events displayed depend on the location of the display device, e.g., Academic Area, Hostel Area, etc. Devised an algorithm to decide the order, frequency, and duration for which each event is displayed according to location. Designed the website with a Django backend that renders events fetched through the Airtable API onto predefined templates.
7th Inter IIT Tech Meet, IIT Bombay
We used a modified version of FCN (Fully Convolutional Network) for the semantic segmentation of satellite images. Satellite images contain more than three channels and hence provide more information; we applied preprocessing techniques such as edge and texture detection algorithms to create 5-channel input images. Finally, we used an ensemble of supervised and unsupervised techniques to achieve an accuracy of 92.3% on the hold-out test set.
[Report] [Slides] [Github]
Digital System Course Project
The DFT is a linear transformation of a vector Xn (time domain) into a vector Xm (the coefficients of the sinusoidal components of the time-domain signal). Computing the DFT directly requires O(N^2) time, while the FFT reuses precomputed results and requires O(N log N) time. Implemented an 8-point radix-2 FFT with a bit-reversal module, butterfly unit, twiddle factor generator, and control module.
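As a software illustration of the same building blocks, the sketch below implements an iterative radix-2 decimation-in-time FFT in Python, with a direct O(N^2) DFT for comparison. It is not the hardware design: the function names are hypothetical, and the mapping of `bit_reverse`, `twiddle`, and the inner butterfly loop onto the project's modules is an assumption made for illustration.

```python
import cmath


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (the bit-reversal step)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def twiddle(k, n):
    """Twiddle factor W_n^k = exp(-2*pi*i*k/n)."""
    return cmath.exp(-2j * cmath.pi * k / n)


def fft_radix2(samples):
    """Iterative radix-2 decimation-in-time FFT; length must be a power of two."""
    n = len(samples)
    bits = n.bit_length() - 1
    if n == 0 or (1 << bits) != n:
        raise ValueError("input length must be a power of two")
    # Reorder the input into bit-reversed order.
    data = [complex(samples[bit_reverse(i, bits)]) for i in range(n)]
    size = 2
    while size <= n:  # log2(n) stages
        half = size // 2
        for start in range(0, n, size):
            for k in range(half):
                # Butterfly: combine two half-size results with one twiddle factor.
                top = data[start + k]
                bottom = twiddle(k, size) * data[start + k + half]
                data[start + k] = top + bottom
                data[start + k + half] = top - bottom
        size *= 2
    return data


def dft(samples):
    """Direct O(N^2) DFT, used only as a reference for checking the FFT."""
    n = len(samples)
    return [sum(samples[t] * twiddle(t * m, n) for t in range(n)) for m in range(n)]
```

For an 8-point input such as `[1, 2, 3, 4, 0, 0, 0, 0]`, `fft_radix2` and `dft` should agree up to floating-point rounding, with the FFT running three butterfly stages of four butterflies each.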
[Poster] [Github]
PROBOT is a custom chatbot for the IITGN community that gives details about professors and their research interests. The model was trained on the Google News corpus, which contains approximately three billion English words. Word2Vec was used to convert words to vectors, and cosine similarity was used to predict the most likely output. We used the pyttsx3 module for text-to-speech conversion and the Django framework to link the backend and frontend.
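The matching step can be sketched as follows. This is a minimal illustration that assumes a query and each candidate answer are represented by the average of their word vectors, which the description above does not specify; `embeddings` is a hypothetical dictionary standing in for a loaded Word2Vec model, and the function names are not taken from the project.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def average_vector(words, embeddings):
    """Average the vectors of the known words (an assumed sentence representation)."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def best_match(query, candidates, embeddings):
    """Return the candidate whose averaged vector is most similar to the query."""
    query_vec = average_vector(query.lower().split(), embeddings)
    if query_vec is None:
        return None
    best, best_score = None, -1.0
    for candidate in candidates:
        cand_vec = average_vector(candidate.lower().split(), embeddings)
        if cand_vec is None:
            continue
        score = cosine_similarity(query_vec, cand_vec)
        if score > best_score:
            best, best_score = candidate, score
    return best
```

With real Word2Vec vectors, `embeddings` would map each vocabulary word to a vector of a few hundred dimensions; the toy two-dimensional vectors used to exercise this sketch are placeholders.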
[Github]