
Methodology

The data was given to us in CSV and Excel formats. The dataset contained survey questions
ranging from a student’s comfort level in Statistics to how well students are able to
understand their instructor’s instructions. To interpret the data effectively, we decided to
group the variables using factor analysis. The resulting factors would ultimately tell us
which variables are responsible for influencing students’ fear of Statistics as a subject.
Steps taken before Factor Analysis:
1. Data Cleaning: For an efficient factor analysis, we used only data that R can process to
give meaningful results. We therefore excluded Question 10, because it was an open-ended
question and would not yield a meaningful result in factor analysis. We also removed the
initial columns (Respondent Id, Collector Id, Start Date and End Date) from the CSV file
before running the R code.
2. Gap filling: We did not remove the missing values. Instead, we filled those gaps using the
norm.predict method of the mice package, which uses regression to predict the missing values.
3. Nomenclature of variables: We changed the data labels of the variables to make the data
easier to interpret.
4. Packages: We loaded the mice and psych packages to help run the code.
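The gap-filling step can be illustrated with a minimal sketch. The report ran this step in R through mice; the Python below is only an illustration of the underlying idea (predicting a missing answer from a regression fitted on the complete cases) and is not the authors' code. The function name impute_by_regression, the use of a single predictor, and the sample answers are all assumptions made for the example.

```python
from statistics import mean


def impute_by_regression(x, y):
    """Fill None entries of y with predictions from a least-squares line on x.

    Hypothetical helper: one fully observed predictor x, one target y
    that has gaps. The line is fitted on the complete cases only.
    """
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    mean_x = mean(xi for xi, _ in observed)
    mean_y = mean(yi for _, yi in observed)
    sxx = sum((xi - mean_x) ** 2 for xi, _ in observed)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Keep observed answers; replace gaps with the fitted value.
    return [yi if yi is not None else intercept + slope * xi
            for xi, yi in zip(x, y)]


# Made-up survey answers; None marks a skipped question.
comfort = [1, 2, 3, 4, 5]
understanding = [1, None, 3, 4, None]
filled = impute_by_regression(comfort, understanding)
```

Because the filled values are plain regression predictions with no added noise, they lie exactly on the fitted line, which tends to understate the variability of the imputed variable.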
Factor Analysis:
1. MSA Value: The MSA (measure of sampling adequacy) values are calculated as part of the KMO
test to check whether the correlations among the variables are appropriate for factor analysis.
2. Factors: The factors with an eigenvalue greater than 1 were retained as the factors
influencing the dependent variable.
3. Factor Loadings: We printed the factor loadings in accordance with the variances found. We
chose a cut-off of 0.5, as it covered a wide range of variables and gave a clear
demarcation between the factors.
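The three steps above can be sketched in a few lines. The report performed them in R; this Python sketch only illustrates the arithmetic and is not the authors' code. The function names (invert, kmo_overall, symmetric_eigenvalues), the six-variable correlation matrix, and the loading values are hypothetical and are not the report's results. Only the overall KMO statistic is computed; a per-variable MSA value uses the same ratio restricted to that variable's row.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo_overall(corr):
    """Overall KMO: squared correlations over squared correlations plus
    squared partial correlations, summed over all off-diagonal pairs."""
    inv = invert(corr)
    n = len(corr)
    r2 = p2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                p2 += partial ** 2
    return r2 / (r2 + p2)


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes a[p][q].
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Made-up correlation matrix: two groups of three items, correlated 0.6
# within a group and 0.0 across groups.
corr = [[1.0 if i == j else (0.6 if i // 3 == j // 3 else 0.0)
         for j in range(6)] for i in range(6)]

kmo = kmo_overall(corr)                               # step 1: sampling adequacy
eigenvalues = symmetric_eigenvalues(corr)
n_factors = sum(1 for v in eigenvalues if v > 1.0)    # step 2: eigenvalue > 1 rule

# Step 3: keep only loadings at or above the 0.5 cut-off (hypothetical values).
loadings = {
    "Stats_ComfortLevel": (0.72, 0.10),
    "Programming_Proficiency": (0.05, -0.64),
    "Example_Weak_Item": (0.31, 0.22),
}
cutoff = 0.5
salient = {name: [f + 1 for f, v in enumerate(row) if abs(v) >= cutoff]
           for name, row in loadings.items()}
```

With this made-up matrix, two eigenvalues exceed 1, so two factors would be retained, and the overall KMO comes out at about 0.72. In salient, each variable maps to the factor numbers on which its absolute loading reaches the cut-off; a variable with an empty list is not assigned to any factor.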
Findings and Insights
A student’s learning curve can contribute to his/her fear of the subject: students with a
higher learning curve would be less fearful of the subject, and vice versa. Similarly,
higher interest in the subject can contribute to less fear of the subject among the
students.
Factor 1
We identified Factor 1 as the student’s learning curve in Statistics, which is affected by
a number of independent variables. A learning curve is generally defined as the rate at which a
person progresses in gaining a new skill or experience. The relationship of the learning curve
with each independent variable is as follows:
1. Stats_ComfortLevel – A student’s comfort level in Statistics positively influences their
learning curve in the subject. Since there is a high positive correlation between the two,
a higher comfort level in Statistics would lead to a higher learning curve.
2. Instructor_Understanding – How well students understand the instructions given by the
instructor is positively correlated with a student’s learning curve. If the instructions
are not correctly understood by the students, their learning curve would be
negatively affected.
3. Instructor_Approachability – How approachable the instructor is, as perceived by the
students, is positively correlated with a student’s learning curve. If students find the
instructor hard to approach, their learning curve would be negatively affected.
4. Instructor_Engagement – The ability of the instructor to engage students in class
would positively affect the students’ learning curve.
5. MathLanguage_ComfortLevel – A student’s comfort level with basic Maths language
helps him/her understand the concepts taught in the Statistics class. Thus there is a
positive relationship between comfort level in Maths language and the student’s
learning curve.
6. Instructor_FigureUsage – If the instructor uses diagrams or figures to explain concepts,
those concepts become relatively easier for students to understand. Thus there is a
positive relationship between the instructor’s use of figures or diagrams and a student’s
learning curve.
7. DataVisualization_Ability – A student’s ability to visualize data helps him/her understand
the concepts better. Thus there is a positive relationship between data visualization
ability and the student’s learning curve.
8. Students_BetterInStats – If a student perceives that a large number of classmates are
better than him/her in Statistics, this can negatively affect his/her learning curve. The
reason is the mental barrier the student forms about his/her ability to perform in the
subject.
9. ReadingMaterial_Helpful – If a student finds the reading material provided helpful,
this is conducive to his/her learning. Thus the helpfulness of the reading material
has a positive relationship with a student’s learning curve in Statistics.
Factor 2
We identified Factor 2 as the student’s interest level in Statistics, which is affected by a
number of independent variables. The relationship of the factor with each independent variable
is as follows:
1. Students_in_First_class – The number of students attending the first Statistics class
is indicative of their interest in the subject. A high number of students attending the
first class would indicate a high interest level among students, so there is a positive
relationship between the two.
2. Programming_Proficiency – If a student is highly proficient in programming, his/her
approach to learning in class may be restrictive, because by now he/she may be fixed in
an approach to the subject and might have difficulty with the way the subject is
taught in class.
3. Stats_Software_Proficiency – If a student is highly proficient in statistics software,
his/her attitude to learning in class may be limiting, because by now he/she may be set
in an approach to the subject and might have difficulty with the way the subject is
taught in class.
4. Total_stats_course_taken – If a student has taken a greater number of Statistics courses,
it indicates the student’s inclination towards the subject. Thus there is a high
correlation between the number of Statistics courses taken and a student’s interest
level in the subject.
Exhibit 1: Change in Data Labels

Exhibit 2: Correlation Matrix

Exhibit 3: KMO Test


Exhibit 4: Bartlett Test

Exhibit 5: Scree Plot

Exhibit 6: Factor Loadings


Exhibit 7: Factor Analysis Diagram
