Hello, I’m Rohit Swami!
I'm a Data Science Practitioner who loves to uncover hidden facts and meaningful insights from messy data. I'm a Data Scientist Nanodegree graduate from Udacity, where I learned to build effective machine learning models, run data pipelines, work with Natural Language Processing and Image Processing, build recommendation systems, and deploy solutions to the cloud.
Last year, I worked at MyGov India as a Data Analyst Intern, where my primary task was to analyze the different governmental policies and campaigns on social media.
Personally, I believe that “Time is the gift that most of us take for granted”. I love to keep myself busy gaining new skills and doing freelance projects on the weekends.
I am working towards a role where I can engage with a group of like-minded people who dream of revolutionizing the world with data.
• Built a suggestion system (NLP based) for Help Requests in Python and deployed the model as an API on AWS EC2 using a WSGI Server. The proposed system drastically reduced the query resolution time of 72% of students from 2 days to 13.3 minutes (average).
• Developed an NLP tool to calculate the percentage of code and text in a sentence to avoid plagiarism in TAs’ responses and eventually increase the overall interaction between students and Teaching Assistants.
• Wrote scripts to send emails and migrate 1500+ students from Flock to the open-source platform Mattermost.
• Revamped InterviewBit’s webpages and worked on content creation for Data Science, Machine Learning and Deep Learning.
• Helped 200+ students improve their skills by providing clear, positive, line-by-line actionable feedback on their submitted projects using upGrad's code review tool for data science courses.
• Analyzed public sentiment on different government policies and campaigns on social media.
• Exported over 4000 tweets using the Twitter API and built a hybrid solution to classify each tweet as positive or negative using the KNN algorithm.
• Performed data mining operations on websites for various internal purposes.
• MSTC hosts community events to guide professionals in different technologies. We strive to create a platform where like-minded individuals come together to share and learn about technology.
• As a lead speaker, I share my knowledge about front-end technologies.
International Journal of Emerging Technologies and Innovative Research (www.jetir.org), 5(12), 598-605
Think India Journal, 22(3), 8382-8391
ACM - International Conference Proceedings Series (Jun, 2019)
Problem Statement: Build an algorithm to best identify potential donors.
• My goal was to evaluate and optimize several different supervised learners to determine which algorithm will provide the highest donation yield while also reducing the total number of letters being sent.
• Implemented a pipeline in Python that trains and predicts with a given supervised learning algorithm. I used the Grid Search method to tune the parameters of all algorithms and a Gradient Boosting Classifier to extract feature importances.
Problem Statement: Build an app to predict Indian paper currency.
• Collected the images from search engines and trained a Convolutional Neural Network (CNN) model to predict the 7 types of Indian paper currency i.e. 10, 20, 50, 100, 200, 500, 2000.
• Deployed the machine learning model on Heroku with Flask back-end and secured the app with CSRF protection.
Problem Statement: Building a command-line application to predict flower class along with the probability.
• AI image classification and machine learning utilizing the PyTorch framework.
• Used transfer learning on pre-trained architectures including vgg11, vgg13, vgg16, vgg19, densenet121, densenet169, densenet161, and densenet201.
• Trained dynamic neural networks in Python with GPU acceleration, reaching 85% accuracy.
Problem Statement: Customer Churn Prediction from a Music App Spark
• Used PySpark to analyze the data of a fictional music app, Sparkify, to identify the factors affecting the customers who are most likely to churn.
• Trained a machine learning model on IBM Cloud with an accuracy of 83.87%.
Problem Statement: Analyzing the public data of StackOverflow Survey 2017
• Analyzed the StackOverflow public data of over 64,000 developers around the globe for the year 2017.
• Answered questions such as how education may influence salary, the gender ratio of developers across the world, the rate of salary increase with years of experience, and whether knowing more languages implies a larger salary hike in the IT sector.
Problem Statement: Identify Customer Segments with Arvato Dataset
• The data and design for this project were provided by Arvato Financial Services. I applied unsupervised learning techniques on demographic and spending data for a sample of German households.
• Preprocessed the data, applied dimensionality reduction techniques, and implemented clustering algorithms to segment customers with the goal of optimizing customer outreach for a mail order company.
• The objective was to find relationships between demographic features, organize the population into clusters, and see how prevalent customers are in each of the segments obtained.
Problem Statement: Building a machine learning pipeline
• Analyzed disaster data from Figure Eight to build a model for an API that classifies disaster messages.
• Created a machine learning pipeline to categorize the events so that messages can be sent to an appropriate disaster relief agency, and deployed it as a web app.
Problem Statement: Get real-time updates of cricket matches on your desktop
• Get real-time push notifications on your desktop for every four, six, and fall of a wicket in Indian Premier League matches.
• Used a persistent HTTP connection to extract the live score from a webpage.
Problem Statement: Get the name of the continent in which a country is located.
• Built a pretty simple Python package to understand how packages work on PyPI.
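As a rough illustration of the idea, here is a minimal sketch of the kind of lookup such a package might expose. The function name `get_continent` and the small `_CONTINENTS` table are hypothetical and are not taken from the published package.

```python
# Hypothetical sketch: names and data are illustrative, not the real package's API.
_CONTINENTS = {
    "india": "Asia",
    "germany": "Europe",
    "brazil": "South America",
    "kenya": "Africa",
    "australia": "Oceania",
}


def get_continent(country: str) -> str:
    """Return the continent for a country name (case-insensitive)."""
    key = country.strip().lower()
    try:
        return _CONTINENTS[key]
    except KeyError:
        # Hide the internal KeyError and report the original input instead.
        raise ValueError(f"Unknown country: {country!r}") from None
```

Normalizing the input before the lookup keeps the table small and lets callers pass names such as `" India "` without cleaning them first.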
A Project made under Microsoft Technical Community LPU
• The project centers on developing a URL shortening service. You have got long URLs that are hard to remember. Use shortTo.com to shorten your URL and give them an easy to remember
Twitter, one of the largest social media sites, receives millions of tweets every day. This huge amount of raw data can be used for industrial or business purposes by organizing and processing it according to our requirements. This project provides a way of doing sentiment analysis using Hadoop, which processes the huge amount of data on a Hadoop cluster faster, in real time.
• Hacktoberfest is a month-long celebration of open source software in partnership with GitHub, in which participants need to make 4 pull requests across GitHub.
• Hacktoberfest Status Checker is an open-source tool to check the status of your Hacktoberfest activities in the month of October.
Sales data was given in CSV format and the task of this project was to derive valuable insights from the raw data, like:
• Which product sold the most?
• What payment modes were used for purchasing products?
• What is the most common payment method for the United States?
• What was the earliest time of the day a transaction occurred?
• Were there repeat customers? Discuss possible issues.
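The questions above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the project's actual code: the column names (`Transaction_date`, `Product`, `Payment_Type`, `Name`, `Country`), the date format, and the sample rows are all made up for the example.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Made-up sample rows; the real file's columns and date format may differ.
SAMPLE_CSV = """Transaction_date,Product,Payment_Type,Name,Country
1/2/09 6:17,Product1,Mastercard,carolina,United Kingdom
1/2/09 4:53,Product1,Visa,Betina,United States
1/3/09 14:44,Product2,Visa,Betina,United States
1/4/09 0:05,Product1,Amex,Gouya,United States
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def most_sold_product(rows):
    return Counter(r["Product"] for r in rows).most_common(1)[0][0]


def payment_modes(rows):
    return sorted({r["Payment_Type"] for r in rows})


def most_common_payment(rows, country):
    counts = Counter(r["Payment_Type"] for r in rows if r["Country"] == country)
    return counts.most_common(1)[0][0]


def earliest_time_of_day(rows):
    # Assumed format: month/day/two-digit year, then 24-hour time.
    stamps = (datetime.strptime(r["Transaction_date"], "%m/%d/%y %H:%M") for r in rows)
    return min(s.time() for s in stamps)


def repeat_customers(rows):
    # Caveat: matching on name alone can merge different people who share a name.
    counts = Counter(r["Name"] for r in rows)
    return sorted(name for name, n in counts.items() if n > 1)
```

The caveat in `repeat_customers` is one of the "possible issues" the last question hints at: without a customer ID, a name-based match can both over-count and under-count repeat buyers.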
• Smart Q Labs is a dynamic queue management solution that takes care of your queue number and notifies you from time to time. We also provide analytics for outlets so that they can manage, and enjoy managing, their queues.
• I designed the mobile website as a PWA (Progressive Web App), built to work offline or on a poor network connection. A PWA uses modern web capabilities to deliver an app-like experience to users. It uses the app-shell model to provide app-style navigation and interactions.
The dataset contains information on people who died from "Diabetes Mellitus" between 1999 and 2015. The task of this project was to derive valuable insights from the raw data, like:
• In which state did the most deaths occur?
• Over what period (start and end) was the data collected?
• What was the total number of deaths in 2006?
• Which state had the fewest deaths in 2001?
• How many people died from "Diabetes Mellitus" over the entire reporting period?
Being a teammate he is also a good person. A very humble guy always ready to help who approaches him. Have a good habit of learning and adopting new technologies. We worked together on few projects and he turned out to be the best resource for the team.
This man is amazing. We had no clue of analytics. He took his time and explained sent videos. He communicated with us through each stage he completes. The reports he provided allowed us to make changes which will benefit our business greatly. We will do more projects with him in the very near future
Great developer to work with; easy going, honest and reliable. He understood what I wanted; answered my questions and also completed the project before the deadline, which was a major bonus!
My goal was to deliver a high grade quality website. I want my customers to feel at ease as he/she navigate and attain the information they need to make an inform decision to buy the product. What I enjoy about this gig, was how expedient he was in delivering a high grade tailor made website in a matter of a few days. Great work.
Very good guy, helped solve my problem very quickly and on time. He was prompt and delivered as he said he would, great experience working with him.