Full Stack Data Scientist
Kaggle 2x Expert
AI • ML • NLP • CV
Hello, I’m Rohit Swami!
I'm a Data Science practitioner who loves to uncover hidden
facts and meaningful insights in messy data. I'm a Data Scientist
Nanodegree graduate from Udacity, where I learned to build effective
Machine Learning models, run Data Pipelines, apply Natural Language
Processing and Image Processing, build Recommendation Systems, and
deploy solutions to the cloud.
I am working towards a role where I can engage with a group of
like-minded people who dream of revolutionizing the world with data.
• Implemented an end-to-end auto-suggestion tool to find the semantic
similarity between the solutions to students' queries, and deployed the model
as a REST API on AWS EC2 with a Flask backend. The system reduced the
average query resolution time for 72% of students from 2 days to
13.3 minutes.
• Developed a dashboard to analyse Key Performance Indicators (KPIs) using SQL queries.
• Built an NLP tool to calculate the percentage of code
and text in TAs' responses, to
increase overall interaction between students and TAs.
• Wrote scripts to send emails and migrate 1500+ students from
Flock to the open-source platform Mattermost.
• Revamped InterviewBit’s webpages and created content on Data
Science, Machine Learning and Deep Learning.
• Helped 200+ students improve their skills by providing clear, positive, line-by-line actionable feedback on their submitted projects using upGrad's code review tool for data science courses.
• Analyzed public sentiment on social media towards different
government policies and campaigns.
• Exported over 4000 tweets with the Twitter API and built a
hybrid solution to classify each tweet as positive or negative
with the KNN algorithm.
• Performed data mining operations on websites for various internal
purposes.
• MSTC hosts community events to guide professionals in different
technologies. We strive to create a platform where like-minded individuals
come together to share and learn about technology.
• As a lead speaker, I share my knowledge about front-end technologies.
International Journal of Emerging Technologies and Innovative Research (www.jetir.org), 5(12), 598-605
ISSN: 2349-5162
Think India Journal, 22(3), 8382-8391
ISSN: 0971-1260
ACM - International Conference Proceedings Series (Jun, 2019)
DOI: 10.1145/3339311.3339356
Problem Statement: Build an algorithm to
best identify potential donors.
• My goal was to evaluate and optimize several different
supervised learners to determine which algorithm will
provide the highest donation yield while also reducing the
total number of letters being sent.
• Implemented a pipeline in Python that trains and
predicts with a given supervised learning algorithm. I used
the Grid Search method to tune the parameters of all
algorithms and a Gradient Boosting Classifier to extract
feature importances.
Problem Statement: Build an app to
predict Indian paper currency.
• Collected images from search engines and trained a
Convolutional Neural Network (CNN) model to predict
the 7 denominations of Indian paper currency, i.e. 10, 20, 50,
100, 200, 500, 2000.
• Deployed the machine learning model on Heroku with
Flask back-end and secured the app with CSRF
protection.
Problem Statement: Building a command-line
application to predict flower class along with the
probability.
• Built an image classifier with machine learning using the
PyTorch framework.
• Used transfer learning on pre-trained architectures
including vgg11, vgg13, vgg16, vgg19, densenet121,
densenet169, densenet161, and densenet201.
• Trained dynamic neural networks in Python with GPU
acceleration, reaching 85% accuracy.
Problem Statement: Customer Churn Prediction
for a Music App using Spark
• Used PySpark to analyze the data of a fictional music app,
Sparkify, to identify the factors affecting the customers who
are most likely to churn.
• Trained a machine learning model on IBM Cloud with an
accuracy of 83.87%.
Problem Statement: Analyzing the public data of StackOverflow Survey 2017
• Analyzed the StackOverflow public data of over 64,000
developers around the globe for the year 2017.
• Answered questions such as how education may influence
salary, the gender ratio of developers across the world, the
rate of increase in salary with years of experience, and whether
knowing more languages implies a higher salary in the IT sector.
Problem Statement: Identify Customer
Segments with Arvato Dataset
• The data and design for this project were provided by
Arvato Financial Services. I applied unsupervised learning
techniques on demographic and spending data for a sample of
German households.
• Preprocessed the data, applied dimensionality reduction
techniques, and implemented clustering algorithms to segment
customers with the goal of optimizing customer outreach for
a mail order company.
• The objective was to find relationships between
demographic features, organize the population into
clusters, and see how prevalent customers are in each of the
segments obtained.
Problem Statement: Building a machine
learning pipeline
• Analyzed disaster data from Figure Eight to build a model
for an API that classifies disaster messages.
• Created a machine learning pipeline to categorize the
events so that messages can be sent to an appropriate
disaster relief agency, and deployed it as a web app.
Problem Statement: Get real-time updates of
cricket matches on your desktop
• Get real-time push notifications on your desktop for every
four, six and fall of a wicket in Indian Premier League
matches.
• Used a persistent HTTP connection to extract the live
score from a webpage.
Problem Statement: Get the name of the
continent in which a country is located.
• Built a simple Python package to understand how
packages work on PyPI.
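A minimal sketch of what such a lookup could look like. This is an illustration only, not the published package's actual API: the function name `continent_of` and the small mapping are hypothetical.

```python
# Hypothetical sketch of a country -> continent lookup.
# The mapping is a tiny illustrative sample, not the package's real data.
_CONTINENTS = {
    "india": "Asia",
    "germany": "Europe",
    "brazil": "South America",
    "kenya": "Africa",
    "canada": "North America",
    "australia": "Oceania",
}


def continent_of(country: str) -> str:
    """Return the continent for a country name (case-insensitive)."""
    key = country.strip().lower()
    try:
        return _CONTINENTS[key]
    except KeyError:
        # Fail loudly instead of guessing for names missing from the sample.
        raise ValueError(f"Unknown country: {country!r}") from None
```

Normalizing the input (trimming whitespace, lower-casing) keeps the lookup forgiving, while unknown names raise an error rather than return a wrong answer.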
A Project made under Microsoft Technical Community LPU
• The project centers around developing a URL shortening
service. Long URLs are hard to remember. Use shortTo.com to
shorten your URLs and give them an easy-to-remember address.
• Hosted on AWS, with Flask in the backend. The Bootstrap
framework and JavaScript with media queries are used to make
the UX/UI more responsive and interactive.
Twitter, one of the largest social media sites, receives millions of tweets every day. This huge amount of raw data can be used for industrial or business purposes once it is organized and processed according to our requirements. This project provides a way of performing sentiment analysis using Hadoop, which processes the huge amount of data on a Hadoop cluster faster, in real time.
• Hacktoberfest is a month-long celebration of open source
software in partnership with GitHub, in which participants need to
make 4 pull requests across GitHub.
• Hacktoberfest Status Checker is an open-source tool
to check the status of your Hacktoberfest activities in the month of
October.
Sales data was given in CSV format and the task of this project was to derive valuable insights from the raw data, like:
• Which product sold the most?
• What payment modes were used for purchasing products?
• What is the most common payment method for the United
States?
• What was the earliest time of the day a transaction
occurred?
• Were there repeat customers? Discuss possible issues.
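The questions above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the column names (`Transaction_Date`, `Product`, `Payment_Type`, `Name`, `Country`), the timestamp format, and the sample rows are all assumptions made up for the example.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def summarize_sales(rows):
    """Answer the questions above for an iterable of row dicts."""
    rows = list(rows)
    products = Counter(r["Product"] for r in rows)
    payments = Counter(r["Payment_Type"] for r in rows)
    us_payments = Counter(
        r["Payment_Type"] for r in rows if r["Country"] == "United States"
    )
    # Earliest time of day, ignoring the date part of the timestamp.
    earliest = min(
        datetime.strptime(r["Transaction_Date"], "%Y-%m-%d %H:%M").time()
        for r in rows
    )
    # Names seen more than once; a name alone may not identify one person.
    names = Counter(r["Name"] for r in rows)
    return {
        "top_product": products.most_common(1)[0][0],
        "payment_modes": sorted(payments),
        "top_us_payment": us_payments.most_common(1)[0][0] if us_payments else None,
        "earliest_time": earliest,
        "repeat_customers": sorted(n for n, c in names.items() if c > 1),
    }


# Invented sample rows, for illustration only.
SAMPLE = """Transaction_Date,Product,Payment_Type,Name,Country
2009-01-02 06:17,Product1,Mastercard,Asha,United States
2009-01-02 04:53,Product1,Visa,Ben,United Kingdom
2009-01-03 13:08,Product2,Visa,Asha,United States
2009-01-04 09:30,Product1,Visa,Carl,United States
"""

summary = summarize_sales(csv.DictReader(io.StringIO(SAMPLE)))
print(summary)
```

Matching repeat customers by name is the "possible issue" worth discussing: two different people can share a name, and the same person can appear under different spellings, so a real analysis would want a stable customer identifier.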
• Smart Q Labs is a dynamic queue management solution which
takes care of your queue number and sends you notifications from
time to time. We also provide analytics for outlets so that
they can manage their queues more easily.
• I designed the mobile website as a PWA (Progressive
Web App), built to work offline or on a poor network
connection. A PWA uses modern web capabilities to deliver
an app-like experience to users. It uses the app-shell model
to provide app-style navigation and interactions.
The dataset contains information on people who died from "Diabetes Mellitus" between 1999 and 2015. The task of this project was to derive valuable insights from the raw data, like:
• In which state did the most deaths occur?
• Over what period (start and end) was the data
collected?
• What was the total number of deaths in 2006?
• Which state had the fewest deaths in 2001?
• How many people died from "Diabetes Mellitus" over the
entire reporting period?
Being a teammate he is also a good person. A very humble guy always ready to help who approaches him. Have a good habit of learning and adopting new technologies. We worked together on few projects and he turned out to be the best resource for the team.
This man is amazing. We had no clue of analytics. He took his time and explained sent videos. He communicated with us through each stage he completes. The reports he provided allowed us to make changes which will benefit our business greatly. We will do more projects with him in the very near future
Great developer to work with; easy going, honest and reliable. He understood what I wanted; answered my questions and also completed the project before the deadline, which was a major bonus!
My goal was to deliver a high grade quality website. I want my customers to feel at ease as he/she navigate and attain the information they need to make an inform decision to buy the product. What I enjoy about this gig, was how expedient he was in delivering a high grade tailor made website in a matter of a few days. Great work.
Very good guy, helped solve my problem very quickly and on time. He was prompt and delivered as he said he would, great experience working with him.