
ankittaxak5717.github.io
September 27, 2018 – September 27, 2018
ankittaxak5717.github.io — GitHub repository
Crimedetection
August 21, 2018 – August 21, 2018
This is a multi-label classification problem. Multi-label classification assigns a set of target labels to each sample. It can be thought of as predicting properties of a data point that are not mutually exclusive, such as the topics relevant to a document. Each article description in the dataset has crime charges (labels), and the task is to predict the crime charges for future descriptions. Credit: http://scikit-learn.org/stable/modules/multiclass.html
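To make the multi-label setup concrete, here is a minimal sketch in plain Python. It shows how sets of labels can be encoded as a 0/1 indicator matrix and scored with Hamming loss and micro-averaged F1. The charge names, the predictions, and the helper names (binarize, hamming_loss, micro_f1) are made up for illustration; they are not taken from the project's dataset or code.

```python
# Hypothetical crime-charge labels; the real dataset's labels are not shown here.
CLASSES = ["assault", "fraud", "theft"]


def binarize(label_sets, classes):
    """Turn each set of labels into a 0/1 indicator row, one column per class."""
    return [[1 if c in labels else 0 for c in classes] for labels in label_sets]


def hamming_loss(y_true, y_pred):
    """Fraction of (sample, class) slots where the prediction is wrong."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def micro_f1(y_true, y_pred):
    """F1 computed from counts pooled over all samples and classes."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Illustrative ground truth and predictions for three article descriptions.
# A sample may carry several labels, exactly one, or none at all.
true_labels = [{"theft", "assault"}, {"fraud"}, set()]
pred_labels = [{"theft"}, {"fraud", "theft"}, set()]

Y_true = binarize(true_labels, CLASSES)
Y_pred = binarize(pred_labels, CLASSES)

print("indicator matrix:", Y_true)
print("hamming loss:", round(hamming_loss(Y_true, Y_pred), 4))
print("micro F1:", round(micro_f1(Y_true, Y_pred), 4))
```

Because the labels are not mutually exclusive, each column of the indicator matrix can be treated as its own yes/no prediction, which is why per-slot metrics such as Hamming loss are a natural fit for this kind of problem.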
Stackoverflowtagpredictor
August 21, 2018 – August 21, 2018
Description: Stack Overflow is the largest, most trusted online community for developers to learn, share their programming knowledge, and build their careers. It is something every programmer uses in one way or another; each month, over 50 million developers visit the site. It features questions and answers on a wide range of topics in computer programming. The website serves as a platform for users to ask and answer questions and, through membership and active participation, to vote questions and answers up or down and to edit them in a fashion similar to a wiki or Digg. As of April 2014, Stack Overflow had over 4,000,000 registered users, and it exceeded 10,000,000 questions in late August 2015. Based on the tags assigned to questions, the eight most discussed topics on the site are Java, JavaScript, C#, PHP, Android, jQuery, Python, and HTML.
Netflixmoviereviews
August 21, 2018 – August 21, 2018
Netflix is all about connecting people to the movies they love. To help customers find those movies, it developed a world-class movie recommendation system, Cinematch. Its job is to predict whether someone will enjoy a movie based on how much they liked or disliked other movies. Netflix uses those predictions to make personal movie recommendations based on each customer’s unique tastes. While Cinematch is doing pretty well, it can always be made better. There are a lot of interesting alternative approaches to how Cinematch works that Netflix hasn’t tried. Some are described in the literature, some aren’t. The open question is whether any of these can beat Cinematch by making better predictions, because a much better approach could make a big difference to Netflix's customers and business. Credits: https://www.netflixprize.com/rules.html
QuoraQuestionPairSimilarity
July 22, 2018 – July 22, 2018
Quora Question Pairs
1. Business Problem
1.1 Description
Quora is a place to gain and share knowledge about anything. It’s a platform to ask questions and connect with people who contribute unique insights and quality answers. This empowers people to learn from each other and to better understand the world. Over 100 million people visit Quora every month, so it's no surprise that many people ask similarly worded questions. Multiple questions with the same intent can cause seekers to spend more time finding the best answer to their question, and make writers feel they need to answer multiple versions of the same question. Quora values canonical questions because they provide a better experience to active seekers and writers, and offer more value to both of these groups in the long term. Credits: Kaggle
Problem Statement
Identify which questions asked on Quora are duplicates of questions that have already been asked. This could be useful to instantly provide answers to questions that hav
TaxiPickupsPredictions
July 22, 2018 – July 22, 2018
Taxi demand prediction in New York City using time-series data
MalwareDetection
July 22, 2018 – July 22, 2018
Microsoft Malware detection
1. Business/Real-world Problem
1.1. What is Malware?
The term malware is a contraction of malicious software. Put simply, malware is any piece of software that was written with the intent of doing harm to data, devices, or people. Source: https://www.avg.com/en/signal/what-is-malware
1.2. Problem Statement
In the past few years, the malware industry has grown so rapidly that syndicates invest heavily in technologies to evade traditional protection, forcing anti-malware groups and communities to build more robust software to detect and terminate these attacks. A major part of protecting a computer system from a malware attack is identifying whether a given file or piece of software is malware.
1.3 Source/Useful Links
Microsoft has been very active in building anti-malware products over the years, and it runs its anti-malware utilities on over 150 million computers around the world. This generates tens of millions of daily data points to be analyzed as
Cancerdaignosis
July 10, 2018 – July 10, 2018
Personalized cancer diagnosis
1. Business Problem
1.1. Description
Source: https://www.kaggle.com/c/msk-redefining-cancer-treatment/
Data: Memorial Sloan Kettering Cancer Center (MSKCC)
Download training_variants.zip and training_text.zip from Kaggle.
Context: Source: https://www.kaggle.com/c/msk-redefining-cancer-treatment/discussion/35336#198462
Problem statement: Classify the given genetic variations/mutations based on evidence from text-based clinical literature.
1.2. Source/Useful Links
https://www.forbes.com/sites/matthewherper/2017/06/03/a-new-cancer-drug-helped-almost-everyone-who-took-it-almost-heres-what-it-teaches-us/#2a44ee2f6b25
https://www.youtube.com/watch?v=UwbuW7oK8rk
https://www.youtube.com/watch?v=qxXRKVompI8
No low-latency requirement.
Interpretability is important.
Errors can be very costly.
Probability of a data-point belonging to each class is needed.
2. Machine Learning Problem Formulation
2.1. Data
Source: https://www.kaggle.com/c/msk-redefining-cancer-treatme
Cultural Fit Analysis
The candidate's project portfolio shows a strong inclination towards personal learning and exploration across data science domains, which could indicate a proactive and curious mindset. The projects are varied, covering NLP (crime charge prediction, Stack Overflow tag prediction, Quora question pairs, cancer diagnosis from clinical text), recommendation systems (Netflix), time-series forecasting (taxi demand), and malware detection, suggesting a broad interest in machine learning applications. However, the lack of team projects or professional experience makes it difficult to assess collaboration or broader cultural fit.
Soft Skills & Operational Fit
There is insufficient data to assess soft skills or operational fit, although the candidate's project descriptions do indicate an ability to identify and frame business problems for machine learning solutions.