What is data science?
Data science is a process that combines predictive and prescriptive analytics. Its main aim is to analyze data to uncover hidden patterns and other useful information that can be leveraged to achieve business goals. It also prescribes a way forward to fix what is broken.
A data scientist often has to wear multiple hats: data engineering, knowledge of scientific methods, mathematics, statistics, advanced computing, dashboarding and visualization, the mindset of a hacker, and domain expertise.
It is rare for one individual to possess all of the above skills, which is exactly why data science cannot be defined in a standard manner. However, if you find a data scientist who has all of them, be grateful, because you just found a unicorn.
As a data scientist with 3 years of experience, I think the domain of data science stands on four pillars.
- Business or domain knowledge
- Mathematics (which includes concepts of both statistics and probability)
- Computer science (which includes sound knowledge of programming)
- Communication (which helps you become a compelling storyteller)
How did I become a data scientist?
It all started during my graduation in 2016. We used to get dissertation projects every semester. All the other students in my class performed qualitative analysis, i.e. they read about the topic on the internet and described it accordingly. My friends and I were the only ones who made a questionnaire on the topic and distributed it in college to record first-hand responses from individuals. We then used SPSS to conduct various statistical analyses, hypothesis tests, etc. All of this was so fascinating that I just had to explore more.
Luckily, I found a faculty member who had the same knack for data as I did. We spent hours and hours learning concepts from the internet and YouTube. He first helped me gain a thorough knowledge of statistics. We started from the very basics, i.e. sample vs. population, and worked up to advanced concepts such as confidence interval estimation, hypothesis testing and A/B testing.
Afterwards, I took two courses on Udemy, on Python programming and MySQL, and practiced a lot with both. During my graduation, I also did a PGDM in data science, which helped me consolidate all my learning and enabled me to advance into machine learning and NLP. The journey was both challenging and beautiful, as it helped me grow into what I am today.
Why is data science important?
The importance of data science can be traced to the strong deliverables that data scientists produce. A few of these deliverables include:
- Pattern recognition
- Anomaly detection
The above metrics, provided by McKinsey, show the benefit derived from each use case of data science.
As I love telling stories, let me give you a glimpse of one of my data science projects. I was once working with a digital marketing giant based out of the U.S. (can you guess the name?). They received a project from a healthcare insurance provider, which was then passed on to my team in India. The insurance provider wanted to roll out an upgraded version of their insurance policy due to the outbreak of the Covid-19 pandemic. They wanted to understand how the market would respond to the upgraded policy and whether they should roll it out in the first place.
We first got our hands dirty with the data they provided and uncovered a few important insights, such as:
- Individuals with older dependents have shown great eagerness in the past towards new policies in the market, so they are strong sales prospects.
- Individuals whose dependents live away from them have opted for various new and better policies in order to keep their loved ones secure.
Based on these insights and inputs from the business team of the digital marketing giant, we created a dry-run campaign. The campaign was intended to show how the market would respond: whether people would buy the policy in the first roll-out, whether they would be open to a discussion to understand the policy better, and what conversion rate was possible. All of this information was recorded during the dry run, and from it we obtained our target variable. The target variable had two labels: “yes” for records where individuals showed positive behavior towards buying the policy, and “no” for records where they were not interested in buying it.
In the next phase of the project, we built a logistic regression model. The model was used during the actual roll-out campaign of the policy. The individuals for whom the model predicted “yes” were pitched in detail, and the individuals for whom the model predicted “no” were pitched on a frequent basis.
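For readers who want to see what a yes/no model of this kind looks like, here is a minimal sketch in Python. It is not the model my team built. The data is synthetic, the two features (`has_older_dependent` and `dependent_lives_away`) are hypothetical names loosely inspired by the insights above, and the training loop is plain gradient descent written by hand rather than a library call.

```python
import math
import random


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(rows, labels, lr=0.5, epochs=500):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # gradient of the log loss w.r.t. the linear score
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_label(weights, bias, x, threshold=0.5):
    # Map the predicted probability back to the "yes"/"no" labels.
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return "yes" if p >= threshold else "no"


# Synthetic stand-in for the dry-run records (assumption: made-up data,
# hypothetical features has_older_dependent and dependent_lives_away).
random.seed(0)


def make_record():
    has_older_dependent = random.randint(0, 1)
    dependent_lives_away = random.randint(0, 1)
    p_yes = 0.1 + 0.4 * has_older_dependent + 0.4 * dependent_lives_away
    label = "yes" if random.random() < p_yes else "no"
    return [has_older_dependent, dependent_lives_away], label


records = [make_record() for _ in range(400)]
X = [features for features, _ in records]
y = [1 if label == "yes" else 0 for _, label in records]  # encode target

weights, bias = train_logistic_regression(X, y)
print(predict_label(weights, bias, [1, 1]))  # both signals present
print(predict_label(weights, bias, [0, 0]))  # neither signal present
```

In a real project you would more likely reach for an established library and evaluate the model on held-out data before using its predictions in a campaign.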
Well, to our astonishment, the healthcare insurance provider benefited hugely. They made a hefty amount of money and also rolled out a very successful product.
So, you see, data science is extremely important when it comes to businesses. Do you have any success stories about data science projects? Can you think of any organization that runs solely on data science?
How has learning data science helped my career?
I believe learning data science was one of the best decisions I’ve ever made. Every time I work on a project, I feel a sense of responsibility. Seeing my deliverables and output address someone’s business need in real time is one of the best feelings.
Harvard Business Review has called data scientist the sexiest job of the 21st century. The field is growing daily and has huge scope in almost all business verticals and industry domains. Healthcare, real estate, finance, manufacturing: every industry holds the potential to leverage data science for its own good and a better future.
…. PS – the name was Merkle.inc