
A Predictive Analysis of the Mean Month Interval between Graduation and First Job Employment of UST IICS and Engineering Students
Baltazar, Gerard Alvin G.
Institute of Information and Computing Sciences,
Department of Information Systems
University of Santo Tomas España, Manila
+63 9154412525
2014067573@ust-ics.mygbiz.com

Duro, Kenny Francis O.
Institute of Information and Computing Sciences,
Department of Information Systems
University of Santo Tomas España, Manila
+63 9063410973
2015081695@ust-ics.mygbiz.com

Dela Cruz, Vince Cedrick V.
Institute of Information and Computing Sciences,
Department of Information Systems
University of Santo Tomas España, Manila
+63 9567550386
2015081694@ust-ics.mygbiz.com

Ngungie, Dennis M.
Institute of Information and Computing Sciences,
Department of Information Systems
University of Santo Tomas España, Manila
+63 9273627822
2013055153@ust-ics.mygbiz.com

Khrisnamonte M. Balmeo
Institute of Information and Computing Sciences,
Department of Information Systems
University of Santo Tomas España, Manila
+63 9553763844
kmbalmeo@ust-ics.mygbiz.com

ABSTRACT

One of the areas checked by international accreditation bodies is the employability of the graduates of academic institutions. Specifically, graduates must be employed within six months after completion of their degree programs, and their jobs must be in line with their degrees. Hence, the University of Santo Tomas Office of Alumni Relations (OAR) keeps track of alumni records, particularly employment history. The project aims to create an analytical dashboard that provides predictive and descriptive analysis using the accumulating Alumni Employment History data gathered by OAR. Alumni data of the Institute of Information and Computing Sciences (IICS) and the Faculty of Engineering were extracted from the Alumni Information System (AIS) database of the university, processed through an automated data transformation process, loaded to the client's new database, and used in a data visualization tool. With the use of visualizations, the work sector and job alignment of the alumni were identified. The Point Biserial Correlation technique was performed to identify which among the attributes, namely gender, location, nationality, and achievements (particularly Latin honors and dean's list), highly affect the interval time of IICS and Engineering graduates. The forecasting method is executed in RStudio and loaded to Power BI for visualization. Single Exponential Smoothing and Double Exponential Smoothing were used as forecasting models. The outcome reveals that the month interval forecast value for Engineering decreased, while the forecast for the month interval for IICS remained the same. Using an optimized smoothing constant, the current forecasts do not display noticeable differences from the last actual values. Single exponential smoothing was found to be the suitable method for forecasting. This work and its future application will be beneficial to the OAR, the students, and future researchers.

CCS Concepts

Information systems---Information systems applications---Decision support systems---Data analytics
Mathematics of computing---Probability and statistics---Statistical paradigms---Time series analysis
Information systems---Information systems applications---Data mining---Data cleaning

Keywords

Predictive; Alumni; Analytics;

1. INTRODUCTION

The Office of Alumni Relations has become essential to the University's competence in the last decades. The investment of human capital and monetary resources in Alumni Relations gives the university more support for scholarships, research projects, laboratories, libraries, new programs, endowments, and new capital. Alumni are also likely to become parents of prospective students, which can help the University increase the number of incoming students (Cummins 2013). In addition,
graduates can also be good resources for present students regarding career guidance, work experience opportunities, and workplace mentoring. Alumni data is also a big factor in establishing a good graduate tracer with the alumni's job status. Knowing the current job status of an alumnus can help the University monitor the employment of its alumni from graduation to their first job experiences. It is a great factor in determining the effectiveness of the curriculum, accreditation of programs, and teaching strategies in preparing the students for employment.

1.1. Purpose and Objective

The main objective is to create an analytical dashboard that provides predictive and descriptive analysis in anticipation of the accumulating data gathered by OAR, which can be beneficial to colleges, faculties, and institutes of the University of Santo Tomas. To meet the general objective, the researchers focused on the following:

1. Identify what attributes highly affect the interval time of IICS and Engineering graduates in terms of getting their first jobs using a Point Biserial correlation technique;
2. Identify what sector, either private or government, the UST IICS and Engineering graduates are affiliated with after graduation;
3. Identify if the careers of the IICS and Engineering graduates are in line with the major they took; and
4. Predict the average interval time between graduation and first job employment of IICS and Engineering graduates using time series analysis.

1.2. Scope and Project Constraints

The project is focused on Engineering and IICS graduates from years 2014 to 2018. The data that came from OAR include alumni data from the AIS (Alumni Information Systems) and Oracle. Graduates with "Employed" status were covered by the project, and their data were collected from AIS. Graduates who have not registered and filled out the employment history form are not covered by the project. The researchers did not collect any factors regarding students' general weighted average since it is confidential. Soft skills were not taken into consideration since they are beyond the data. Those who registered but did not fill out their employment history are disregarded in the analysis. Batch 2018 Engineering graduates have not yet filled out the employment history, on the assumption that they are preparing for the board exam and remain unemployed. This project uses two models only: Single Exponential Smoothing (SES) and Double Exponential Smoothing (DES). SES is used because the present data has no trend or seasonality. DES is used on the assumption that a trend appears as the number of data points increases.

1.3. Conceptual Framework

The researchers gathered the data from two sources, AIS and STEPS. The data from the AIS come from the internal source, which is the Office of Alumni Relations. It contains the alumni registered in the system and holds their personal information, contact information, employment history, awards, and recognition. Refer to the diagram below.

Figure 1 Conceptual Framework of the study

Oracle is composed of all the graduates from the previous batches until the present, classified per batch, because until now STEPS does not have a feature for extracting the data per semester. The Oracle data were extracted in PDF format while the AIS data were extracted in .csv format. In the ETL process, data cleaning and data consolidation occurred in RStudio, after which the important variables for the analysis were selected and an ETL report was generated in .xlsx format. The data then underwent data masking to hide any vital information of the alumni and were stored in MySQL for backup because of the accumulating data of OAR. RStudio and Power BI were used to explore the data and provide results and visualizations.

2. METHODOLOGY

ETL Process

The researchers gathered the raw data from AIS in .csv format and then delivered it to the ETL process with the use of Rscript and a .bat file for automation. The initial step is removing the null values and the unnecessary attributes from the data. The researchers then renamed the attributes for standardization and derived a new attribute called Month_Interval by getting the difference between the date of graduation and the date of first job employment. The researchers also created another attribute called Age_Graduate by taking the difference between the birthdays and the dates of graduation. From the derived attributes, the researchers detected negative values in Month_Interval and Age_Graduated caused by alumni's errors, and these were eliminated. The researchers then reordered the attributes, exported a .csv file, consolidated it with the other clean data, and delivered dirty data reports.
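The derivation and filtering steps of the ETL process can be illustrated with a minimal Python sketch. The study itself used Rscript, so this is not the researchers' code. The column names `birthday`, `graduation_date`, and `first_job_date`, the whole-month counting rule, and the sample rows are assumptions made only for illustration.

```python
from datetime import date

# Hypothetical column names; the actual AIS export may use different ones.
REQUIRED = ("birthday", "graduation_date", "first_job_date")


def months_between(start, end):
    """Whole months from start to end (negative when end precedes start)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the last month is not yet complete
    return months


def years_between(start, end):
    """Whole years from start to end."""
    years = end.year - start.year
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


def clean_records(rows):
    """Split rows into clean records (with derived attributes) and dirty records."""
    clean, dirty = [], []
    for row in rows:
        # Rows with null or empty required values go to the dirty report.
        if any(not row.get(col) for col in REQUIRED):
            dirty.append(row)
            continue
        birthday, graduated, hired = (date.fromisoformat(row[col]) for col in REQUIRED)
        derived = dict(row)
        derived["Month_Interval"] = months_between(graduated, hired)
        derived["Age_Graduate"] = years_between(birthday, graduated)
        # Negative derived values indicate an entry error and are eliminated.
        if derived["Month_Interval"] < 0 or derived["Age_Graduate"] < 0:
            dirty.append(derived)
        else:
            clean.append(derived)
    return clean, dirty


# Made-up sample rows, not taken from the study's data.
rows = [
    {"birthday": "1996-03-10", "graduation_date": "2017-06-01", "first_job_date": "2017-10-15"},
    {"birthday": "1997-01-20", "graduation_date": "2018-06-01", "first_job_date": "2018-02-01"},
    {"birthday": "1995-08-05", "graduation_date": "2016-06-01", "first_job_date": ""},
]
clean, dirty = clean_records(rows)
print(len(clean), "clean row(s),", len(dirty), "dirty row(s)")
```

Keeping the rejected rows in a separate list mirrors the dirty data reports mentioned above, so that entry errors can be reviewed rather than silently dropped.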
Figure 2 Newly Consolidated Data

Figure 2 shows the output of the cleansing and consolidation process.

Business Analytics Model and Testing

Single Exponential Smoothing (SES) and Double Exponential Smoothing (DES) are the forecast models. The researchers derived the mean month interval of the alumni per year, then the mean month interval and batch were assigned as the variables for the forecasting models. The package for the forecast is installed in RStudio. The data are split into two parts: 80% of the actual data is allotted for training and 20% for testing. To determine the accuracy of the models, both SES and DES are performed on the training data set, which allows the user to determine the reliability of the models by comparing the results of the training data and the testing data. Afterwards, the researchers perform Single Exponential Smoothing and Double Exponential Smoothing using the actual data set. The forecast values of each model are displayed, along with the accuracy measures, showing the errors of the forecasts for decision-making.

3. RESULTS AND DISCUSSION

Figure 7 Time Series and Job Alignment Dashboard

The dashboard in Figure 7 contains visualized reports on how long, on average, the graduates of each college took to get a job from the year 2014 to 2018 and what the forecasted value would be based on the single exponential smoothing (SES) and double exponential smoothing (DES) forecasting methods. It also shows the mean absolute percentage error (MAPE), mean absolute deviation (MAD), and mean squared error (MSE) of the two forecasting methods, which serve as a reference for the accuracy and for which method is more reliable. The dashboard also has an indicator of what percentage of the data was used, since only alumni with employment history are taken into account in the analysis. Moreover, it shows how the number of graduates who have in line and not in line jobs affects the average interval time, and the average interval time of graduates who have in line and not in line jobs per program.

Since the data is accumulating, 26% for IICS and 19% for Engineering were the latest percentages of data used by the researchers in the analysis. 74% of IICS alumni and 81% of Engineering alumni have not yet filled out their employment history. Based on the analysis from this dashboard, SES was the more reliable method, with a 17.51% MAPE for Engineering and 39.65% for IICS, compared to DES with a 27.50% MAPE for Engineering and 46.78% for IICS. As observed, there is a huge difference between alumni who have in line and not in line jobs. The total number of alumni who have in line jobs appears to be much higher than the number of those who do not. In addition, the dashboard shows that graduates who got in line jobs have a faster average month interval than those who have not in line jobs.

Figure 8 Correlation Dashboard

Illustrated in Figure 8 are the point biserial correlation plot and the clustered column charts that are categorized by attributes. The dashboard contains visualized reports to identify how the attributes, namely the dean's list, Latin honors, nationality, location, and gender of a graduate, affect and correlate with the average month interval, and whether each of these attributes can contribute to decreasing the amount of time a graduate needs to get his or her first job after graduation. This dashboard can be filtered by college and by batch. It can be observed in the dashboard that month interval has a negative relationship with Latin honors and dean's list. This means that alumni with Latin honors or dean's list tend to have a shorter month interval, so these are among the attributes that can give an alumnus a faster interval in getting a job. It can also be seen that there is a big difference in the average month interval between graduates who achieved Latin honors or dean's list and those who did not.

Figure 9 Dichotomous Attribute Dashboard
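The point biserial correlation reported in the Figure 8 dashboard can be sketched in Python as follows. This is a minimal illustration and not the researchers' RStudio code. It assumes each attribute is coded 1 (for example, with Latin honors) or 0 (without), and the sample values are made up.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(flags, values):
    """Point biserial correlation between a 0/1 attribute and a numeric variable.

    Uses r = (M1 - M0) / s * sqrt(p * q), where M1 and M0 are the group means,
    s is the population standard deviation of all values, and p and q are the
    proportions of the two groups.
    """
    group1 = [v for f, v in zip(flags, values) if f == 1]
    group0 = [v for f, v in zip(flags, values) if f == 0]
    if not group1 or not group0:
        raise ValueError("both groups need at least one observation")
    spread = pstdev(values)
    if spread == 0:
        raise ValueError("values are constant; correlation is undefined")
    p = len(group1) / len(values)
    return (mean(group1) - mean(group0)) / spread * sqrt(p * (1 - p))


# Made-up example: 1 = with Latin honors, 0 = without; intervals in months.
latin_honors = [1, 1, 0, 0, 0, 1]
month_interval = [2, 3, 6, 5, 7, 1]
r = point_biserial(latin_honors, month_interval)
print(round(r, 4))
```

A negative coefficient, as in this made-up sample, corresponds to the pattern described for Figure 8: graduates with the attribute tend to have a shorter month interval.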
Illustrated in Figure 9 are the one hundred percent stacked bar of the work sector of graduates per program and the one hundred percent stacked bars per attribute, namely Latin honors, dean's list, job matching (in line/not in line), gender, nationality, and location (whether they live in the National Capital Region or not). This dashboard can be filtered by batch, college, and most especially by month interval (0-3 months, 4-7 months, 8-11 months, and 12 months or greater). The dashboard contains visualized reports identifying which work sector graduates are affiliated with, whether private or public. It also identifies, for each attribute, the percentage of the total students that falls under the selected month interval filter. This enables OAR to generate reports that compare the graduates who had a faster interval time and those who had a longer interval time in getting their first job.

4. CONCLUSION

4.1 Descriptive

Industrial Engineering graduates take a shorter time to get their first jobs after graduation than the graduates of other programs offered by the UST Faculty of Engineering, while Civil Engineering graduates take a longer time. For the Institute of Information and Computing Sciences, Information Technology graduates find a job the fastest, within 4 months, while Computer Science graduates take the longest, with an interval of 5 months. In addition, the results show that the majority of Faculty of Engineering and Institute of Information and Computing Sciences graduates have jobs in line with their degrees and work in the private sector. From the point biserial correlation analysis, the researchers found that for IICS, both dean's list and Latin honors have a significant relationship with having a faster month interval, while for Engineering only Latin honors has a significant relationship. This suggests that Engineering students do not depend solely on academic achievement but also on the results of their board exam in order to get a job faster. Furthermore, a future study is still needed as OAR's accumulating data increases; the attributes recommended for gathering additional information on the Alumni Information Systems site may be added to the study, which will lead to new results and broader findings.

4.2 Predictive

The forecast suggests that there is a minimal decrease in the average time interval between graduation and first job employment from year 2018 to year 2019 for Engineering. The forecast implies that there are no changes in the average time interval between graduation and first job employment from year 2018 to year 2019 for IICS. For analyzing the alumni who have a 0-3 month interval, the OAR may then refer to the technical attributes that can contribute to the early employment of the graduates, which are associated with the descriptive analysis. Based on the results, the attributes Dean's Lister and Latin honors highly affect the early employment of the IICS alumni, while for Engineering, Latin honors contributes more than Dean's Lister. Single exponential smoothing was found to be a more suitable model for forecasting than double exponential smoothing, as SES showed a much better accuracy result.

5. RECOMMENDATION

To maximize the use of the Alumni Information System in terms of data gathering, the researchers recommend adding the following features to the website. The researchers propose adding the questions "How many months did it take you before you actually tried applying for a job?" and "Are you employed in a local or international company?". (3,4,13) Adding a drop down list for Region and City in order to determine the location of the alumni, a drop down list for programs that have tracks or sub-programs, and another drop down list for college. (5) As the Dean's List and Latin honors data are taken from the offices of IICS and ENG, adding Dean's List and Latin honors fields to the site may help in reducing the difficulty in data collection. (6) To determine the irregular students/Octoberians, making a function where the alumni may be traced per batch and not only per school year is advisable. Data granularity is helpful in determining a more detailed behavior of the data. (8,9) The medical history and medical condition of an alumnus are also attributes to consider adding since they may affect the month interval. (14) To know exactly when an alumnus graduated, adding the month and year of graduation may prove useful. (15) Lastly for the website, adding an option to answer whether the job they took is in line or not in line would be convenient. (7) Aside from the additional features to be added to the website, the researchers highly recommend that the OAR require the alumni to update their employment history in order to extract more reliable data. (10) For analyzing the alumni who have a 0-3 month interval, the OAR may refer to the technical attributes that can contribute to the early employment of the graduates. (11) Based on the results, the attributes Dean's Lister and Latin honors highly affect the early employment of the IICS alumni, while for Engineering, Latin honors contributes more than Dean's Lister.

6. ACKNOWLEDGEMENTS

Achieving a Capstone project needs a huge amount of data, heart, and time. The researchers would not have accomplished this study if it were not for the people who helped along the way. The researchers would like to express their utmost gratitude to the Office of Alumni Relations for their cooperation with regard to our Capstone Project. This project would not have been possible without our client's support. The researchers would like to thank the technical adviser, Mr. Khrisnamonte Balmeo, for lending us his time and effort in this project. The researchers would like to thank their Capstone Coordinator, Asst. Prof. Divinagracia R. Mariano, for assisting and sharing her expertise in the field of analytics. The researchers would also like to thank Asst. Prof. Bernard Sanidad, Asst. Prof. Arne Barcelo, Asst. Prof. Cabero, and Asst. Prof. Christopher D. Ladao for standing as the panel members and providing criticisms and areas of improvement for our study.
7. REFERENCES

[1] Nadu, T. (2017) Alumni Interaction System. Vol 5, Issue 2. Retrieved from International Journal of Computer Science Trends and Technology. Retrieved November 10, 2018, from http://www.ijcstjournal.org/volume-5/issue-2/IJCST-V5I2P58.pdf

[2] Misra, R. K. and Khurana, K. (2017) Employability Skills among Information Technology Professionals: A Literature Review. Vol 122 pp 63-70. Retrieved from https://www.sciencedirect.com/science/article/pii/S1877050917325711

[3] Zaid, M. (2015). Correlation and Regression Analysis. Retrieved February 1, 2019, from http://www.oicstatcom.org/file/textbook-correlation-and-regression-analysis-egypt-en.pdf?fbclid=IwAR1Eb-PL-x3UX7HWgfAyDPds9vu-l_8l4oGWxifhC_yyHKzs4YIM3oeoIJk

[4] Das, S. (2016) Data Science: Theories, Models, Algorithms, and Analytics. Retrieved November 10, 2018, from https://srdas.github.io/Papers/DSA_Book.pdf

[5] Box, G. E. (2015). Time series analysis: Forecasting and control. Retrieved November 3, 2018, from https://books.google.com.hk/books?hl=en&lr=&id=rNt5CgAAQBAJ&oi=fnd&pg=PR7&dq=Time-series+analysis+interval&ots=DJ51tSl1TE&sig=EQawp78GTh5nxmbKJCD1ZAaBfW0&redir_esc=y#v=onepage&q=Time-series%20analysis%20interval&f=false

[6] Voinageagu, V., Caragea C., and Pisica, S. (2012) "Forecasting monthly unemployment by econometric smoothing techniques" https://www.researchgate.net/publication/286622452_Forecasting_monthly_unemployment_by_econometric_smoothing_techniques

[7] Zaid, M. (2015). Correlation and Regression Analysis. Retrieved February 1, 2019, from http://www.oicstatcom.org/file/textbook-correlation-and-regression-analysis-egypt-en.pdf?fbclid=IwAR1Eb-PL-x3UX7HWgfAyDPds9vu-l_8l4oGWxifhC_yyHKzs4YIM3oeoIJk

[8] Ravinder, H. V. (2013). Determining The Optimal Values Of Exponential Smoothing Constants – Does Solver Really Work, 6(3). Retrieved March 16, 2019, from https://files.eric.ed.gov/fulltext/EJ1054363.pdf.
