
TITANIC DATA ANALYSIS

➢ ABSTRACT :
The RMS Titanic was a British passenger liner that sank in the
North Atlantic Ocean in the early morning hours of 15 April
1912, after it collided with an iceberg during its maiden voyage
from Southampton to New York City. The broader goal is to
provide other aspiring data scientists with a cleanly coded view
of data analysis. The plan is to explain topics so that people can
understand my thought process and the general flow that I use
when analyzing new data.
➢ ANALYSIS OF PROJECT :
❏ I will be analyzing the Titanic data, which contains
passenger demographics and other passenger information,
including whether each passenger survived or died.
❏ The analysis of the Titanic data is shown below:

❖ The source code for analysis of Titanic:


EXPLANATION OF SOURCE CODE :
1. Pandas :- Pandas is one of the most widely used tools for data munging. It provides
high-level data structures and manipulation tools designed to make data analysis fast and easy.
2. Matplotlib :- matplotlib.pyplot is a collection of command-style functions that make
Matplotlib work like MATLAB.
3. NumPy :- NumPy is a library for the Python programming language, adding support
for large, multi-dimensional arrays and matrices, along with a large collection of
high-level mathematical functions to operate on these arrays.
4. Seaborn :- Seaborn is a statistical data visualization library built on Matplotlib. Its
histogram plots represent the data distribution by forming bins along the range of the
data and then drawing bars to show the number of observations that fall in each bin.
5. Ensemble :- Ensemble learning helps improve machine learning results by combining
several models. Ensemble methods are meta-algorithms that combine several machine
learning models into one predictive model.
6. Model_Selection :- Model selection is the process of choosing between different
machine learning approaches, e.g. SVM or logistic regression.
❖ The source code for the analysis of how many males and females
survived:
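The original listing is not preserved here. As a minimal sketch of how such a count could be computed with Pandas, the example below groups passengers by sex and aggregates a survival flag. The column names `Sex` and `Survived` and the six rows are assumptions made up for illustration; they are not the real Titanic records.

```python
import pandas as pd

# Toy rows for illustration only; these are NOT the real Titanic records.
# Assumed encoding: Survived is 1 for survived, 0 for died.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Survived": [0, 1, 1, 0, 1, 0],
})

# Number of survivors per sex: summing the 0/1 flag counts the survivors.
survivors_by_sex = df.groupby("Sex")["Survived"].sum()

# Share of each sex that survived: the mean of a 0/1 flag is a proportion.
survival_rate_by_sex = df.groupby("Sex")["Survived"].mean()

print(survivors_by_sex)
print(survival_rate_by_sex)
```

With a real passenger file, the toy DataFrame would be replaced by one loaded with `pd.read_csv`, provided the file uses the same column names.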

❖ The output of analysis:


❖ The source code for analysis of Titanic July 2015:

❖ Output of Titanic July 2015:


❖ The source code for analysis of Titanic August 2017:

❖ Output of Titanic August 2017:


❖ The source code for analysis of Titanic March 2016:

❖ Output of Titanic March 2016:


❖ The source code for analysis of Titanic:

❖ Output of Titanic Analysis:


➢ RANDOM FOREST CLASSIFIER:
❖ Random Forest is a flexible, easy-to-use machine learning
algorithm that often produces good results even without
hyper-parameter tuning. It is also one of the most used
algorithms, because of its simplicity and the fact that it can
be used for both classification and regression tasks. A
random forest builds many decision trees on random
subsets of the data and combines their predictions.
❖ Random forests, also known as random decision forests, are a
popular ensemble method that can be used to build predictive
models for both classification and regression problems.
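As a rough sketch of how a random forest classifier might be trained and evaluated with scikit-learn, the example below fits `RandomForestClassifier` on a synthetic table. The data, the numeric encoding of `Sex`, and the rule that generates the label are all assumptions for illustration; none of it comes from the real Titanic records or from the original source code.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data, NOT the real Titanic records.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "Sex": rng.integers(0, 2, n),      # assumed encoding: 0 = male, 1 = female
    "Pclass": rng.integers(1, 4, n),   # passenger class 1, 2 or 3
    "Age": rng.uniform(1, 80, n),
    "Fare": rng.uniform(5, 500, n),
})
# Toy label: by construction it simply copies the Sex column.
y = X["Sex"].to_numpy()

# Hold out a quarter of the rows to measure accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Default hyper-parameters apart from a fixed seed for reproducibility.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = model.score(X_test, y_test)
print("Test accuracy:", accuracy)
```

Because the toy label is a direct copy of one column, the accuracy here says nothing about how well the model does on the real data; it only shows the fit, predict and score steps.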
➢ CONCLUSION:
❖ This project aims to find factors that may affect the survival
probability of individual passengers and crew when a
disaster happens.
❖ The Random Forest model highlights the importance of the
predictors Sex, Pclass, Fare and Age.
❖ After analyzing all the models, I can conclude that the
predictors Sex, Pclass and Age played a major role in the
survival of Titanic passengers.
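A predictor ranking of this kind can be read from a fitted scikit-learn random forest through its `feature_importances_` attribute. The sketch below is illustrative only: the data are synthetic and the label is constructed to depend on `Sex` alone, so the resulting ranking reflects that construction and not the findings reported above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data, NOT the real Titanic records.
rng = np.random.default_rng(1)
n = 300
X = pd.DataFrame({
    "Sex": rng.integers(0, 2, n),      # assumed encoding: 0 = male, 1 = female
    "Pclass": rng.integers(1, 4, n),
    "Age": rng.uniform(1, 80, n),
    "Fare": rng.uniform(5, 500, n),
})
# Toy label that depends only on Sex, by construction.
y = X["Sex"].to_numpy()

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances, one per column, normalized to sum to 1.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

Impurity-based importances can favor columns with many distinct values, so on real data they are best treated as a first indication to be checked against other evidence.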
