
ASSIGNMENT

“Role of Computers in Scientific Research”

Submitted as partial fulfillment of PhD Course Work

Mahatma Gandhi University, Kottayam

Shinto Babu

Research Scholar

Department of Physics

Union Christian College, Aluva


INTRODUCTION

It would be an understatement to merely mention the role of computers and their applications in scientific research. For at least three decades, computers and everything associated with them have accelerated research and development phenomenally.

The term "computer" itself hardly captures the number of roles these machines now play across countless diverse fields. The data they compute has taken us, quite literally, to the boundaries of the solar system and beyond.

This assignment takes a brief look at the following phases of research.

1. Literature Survey
2. Planning (Simulation and Modeling)
3. Data Collection
4. Data Storage
5. Data Analysis
6. Instrumentation and Controlling
7. Preparation of draft
8. Publication

Rather than restating terminology that has been in use for decades, this assignment aims to introduce the current trends in the use of computers in scientific research.

1. LITERATURE SURVEY

In the conceiving phase of scientific research, identifying a relevant research problem and developing a hypothesis is as important as it is difficult. Singling out a problem requires extensive reading, for which we need to be aware of the recent developments in the chosen area. Hundreds of thousands of research papers reporting new findings are published worldwide. Essentially, the term "worldwide" indicates the connection of computers through networks: the data on one computer is shared through the World Wide Web.
Research findings are published in various journals, so extensive reading requires access to such a gateway. One useful, though not legally authorized, free-of-cost method is Sci-Hub. It is not practical for individuals, especially students, to hold a paid membership on every journal website; this is where Sci-Hub becomes important. A full research paper can be retrieved using its URL, PMID/DOI, or simply a keyword search.
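Papers can also be located programmatically from their identifiers. As a minimal sketch, assuming the requests library, the public Crossref REST API (api.crossref.org) can be queried to fetch a paper's metadata from its DOI; the DOI below is a hypothetical placeholder, and this only illustrates identifier-based lookup, not how any particular service works internally.

```python
# Minimal sketch: fetching a paper's metadata from its DOI via the public
# Crossref REST API. The DOI below is a hypothetical placeholder.
import requests

doi = "10.1000/xyz123"  # hypothetical DOI
resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
if resp.ok:
    meta = resp.json()["message"]
    print("title:  ", meta["title"][0])
    print("journal:", meta.get("container-title", ["(unknown)"])[0])
else:
    print("lookup failed with HTTP status", resp.status_code)
```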

Another convenient method is the mobile application "Researcher". By providing our areas of interest through keywords and the given options, and by selecting the journals we wish to subscribe to, we receive a notification whenever a new paper is published on the requested topic or in the chosen journals.

Books are another indispensable tool for any research. Some books have their publication limited to certain countries, making them unavailable to buy or borrow from a library. Here the widely used website Libgen (libgen.io) plays a vital role: virtually any book (mainly by foreign authors) can be downloaded in PDF or DjVu format, absolutely free of cost.

Once the literature survey is complete and a problem has been identified, the next step is planning.

2. PLANNING

In most scientific research, mathematical modeling and simulation are essential. Before moving on to the experimental phase, one can model the same situation mathematically, identifying and varying the different physical variables involved.

For instance, suppose one has to study the transport of light in tissue. The parameters involved are the thickness of the tissue, the size of the photon beam, the number of photons striking the surface per second, the number of times a photon is scattered before it is fully absorbed, the refractive index, the absorption coefficient, and so on. These can easily be turned into a program using Matlab, and the ideal curve can be obtained even before the experiment is conducted. As an example, a simulation of exactly this situation, showing the absorption and reflection of a photon packet in tissue, was carried out in Matlab at the Optics & Spectroscopy Lab, Department of Physics, Union Christian College.
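The simulation mentioned above was written in Matlab; the following is only a minimal Python sketch of the same idea, a random-walk Monte Carlo of photon packets in a slab, with illustrative (assumed) optical coefficients rather than measured values.

```python
# Minimal Monte Carlo sketch of photon transport in a tissue slab.
# The optical coefficients and thickness are illustrative values, not measured data.
import math
import random

MU_A = 1.0        # absorption coefficient (1/cm), assumed
MU_S = 10.0       # scattering coefficient (1/cm), assumed
THICKNESS = 0.2   # slab thickness (cm), assumed
N_PHOTONS = 100_000

absorbed = reflected = transmitted = 0

for _ in range(N_PHOTONS):
    z, uz = 0.0, 1.0                     # depth and direction cosine (launched into the slab)
    while True:
        step = -math.log(1.0 - random.random()) / (MU_A + MU_S)  # random free path length
        z += uz * step
        if z <= 0.0:                     # escaped back through the top surface
            reflected += 1
            break
        if z >= THICKNESS:               # passed through the slab
            transmitted += 1
            break
        if random.random() < MU_A / (MU_A + MU_S):  # this interaction is an absorption
            absorbed += 1
            break
        uz = 2.0 * random.random() - 1.0             # isotropic scattering: new direction

print(f"absorbed fraction:    {absorbed / N_PHOTONS:.3f}")
print(f"reflected fraction:   {reflected / N_PHOTONS:.3f}")
print(f"transmitted fraction: {transmitted / N_PHOTONS:.3f}")
```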

In statistical modeling, we can calculate the sample size required for a proposed study. The standard deviation of the data from a pilot study is needed for the sample size calculation, which can easily be done using Excel or any programming language.
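For example, one common formula for estimating a mean to within a margin of error E at a given confidence level is n = (z·σ/E)², where σ is the pilot-study standard deviation. The sketch below uses hypothetical values for σ and E; only the formula itself is standard.

```python
# Minimal sketch: sample size to estimate a mean within a margin of error E
# at 95% confidence, n = (z * sigma / E)^2, rounded up.
import math

z = 1.96       # z-score for 95% confidence
sigma = 4.2    # standard deviation from the pilot study (hypothetical value)
E = 0.5        # desired margin of error, same units as sigma (hypothetical value)

n = math.ceil((z * sigma / E) ** 2)
print("required sample size:", n)
```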
In astronomy, where a huge amount of data is obtained from the radio telescopes collaborating with a project, this planning phase is particularly important. The same goes for particle physics experiments.

3. DATA COLLECTION

There is a quote that says "Data is the new oil". Oil changed the world by creating an enormous amount of wealth and prosperity, and data perhaps holds similar potential.

Data has become one of the most valuable resources on the planet. However, it needs to be ethically extracted, refined, distributed and monetized. Just as oil has driven growth and produced wealth for powerful nations, the next wave of growth will be driven by data.

In productive scientific research, data collection, handling and sharing are driven by the use of computers.

The data can be obtained in many ways.

For survey-based research, scholars have been using Google Forms for a while now. Questionnaires can be shared as links, responses are collected in real time, and the results can then be analyzed easily.
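Responses collected in Google Forms can be exported to a spreadsheet or CSV file and analyzed directly. The sketch below assumes the pandas library; the file name "responses.csv" and the column name "Age group" are hypothetical examples.

```python
# Minimal sketch: analyzing survey responses exported from Google Forms as CSV.
# "responses.csv" and the column "Age group" are hypothetical examples.
import pandas as pd

df = pd.read_csv("responses.csv")       # one row per submitted response
print(df.shape[0], "responses received")
print(df["Age group"].value_counts())   # frequency table for one question
```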

As mentioned before, when our research is carried out in collaboration with entities such as radio astronomy centers like IUCAA (Inter-University Centre for Astronomy and Astrophysics), or particle physics centers like INO (India-based Neutrino Observatory) or CERN (European Organization for Nuclear Research), these institutions act as data hubs.

To be specific, IUCAA, Pune, runs the Virtual Observatory project, which gives users access to raw observational data along with advanced processing software designed by engineers at Persistent (a collaborator).

IUCAA, along with the Raman Research Institute and the Indian Institute of Astrophysics, Bangalore, announced a proposal to take a ten percent stake in the Large Telescope Project, which would allow Indian astronomers to access data from major upcoming observatories such as the Giant Magellan Telescope (GMT), the Thirty Meter Telescope (TMT) and the European Extremely Large Telescope (EELT).

To elaborate on INO: with an estimated cost of 1500 crore rupees, it is a multi-institute collaboration and one of the biggest experimental particle physics projects undertaken in India. When completed, the main magnetised iron calorimeter (ICAL) experiment will include the world's largest magnet, four times larger than the 12,500-tonne magnet in the Compact Muon Solenoid detector at CERN in Geneva, Switzerland. Importantly, an MoU has already been signed between seven major project partners and thirteen other project partners, covering almost all major research institutions, and their experimental data will be available to all.

4. DATA STORAGE

A common mistake seen during research, especially towards the end, is losing data from a laptop, pen drive, external hard disk, etc. Data lost from a drive can often be restored using PhotoRec, part of the TestDisk package available on Ubuntu. But if the hardware itself fails, which is quite possible, and we do not have any backups, we have to start from scratch.

Hence, most researchers nowadays rely on online data storage facilities. The advantage is that we do not need to carry storage devices with us: data kept in these virtual storage spaces can be accessed or shared at any place and time, without the fear of data loss. The facility that enables this is called cloud storage.

Cloud storage is a model of data storage in which the digital data is stored in
logical pools, the physical storage spans multiple servers (and often locations), and the
physical environment is typically owned and managed by a hosting company. These
cloud storage providers are responsible for keeping the data available and
accessible, and the physical environment protected and running. People and
organizations buy or lease storage capacity from the providers to store user,
organization, or application data.

The following are a few examples.

 Google Drive
Google Drive is a file storage and synchronization service developed by Google. It allows users to store files on Google's servers, synchronize files across devices, and share files. In addition to a website, Google Drive offers apps with offline capabilities for Windows and macOS computers, and for Android and iOS smartphones and tablets.
Google Drive offers users 15 gigabytes of free storage, with 100 gigabytes, 1 terabyte, 2 terabytes, 10 terabytes, 20 terabytes and 30 terabytes offered through optional paid plans. A minimal sketch of uploading a file to Drive programmatically is given after this list.
 Dropbox
Dropbox is a file hosting service operated by American company Dropbox,
Inc., that offers cloud storage, file synchronization, personal cloud, and client
software.
Dropbox can create a special folder on the user's computer, the contents of
which are then synchronized to Dropbox's servers and to other computers and
devices that the user has installed Dropbox on, keeping the same files up-to-
date on all devices. Dropbox uses a freemium business model, where users are
offered a free account with a set storage size, with paid subscriptions available
that offer more capacity and additional features.
Dropbox Basic users are given 2 gigabytes of free storage space. Dropbox Plus
users are given 1 terabyte of storage space, as well as additional features.
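As an illustration of using such cloud storage programmatically, the sketch below assumes the google-api-python-client and google-auth libraries and a pre-configured service account; the key file "service_account.json" and the uploaded file "results.csv" are hypothetical names, and the authentication setup is simplified.

```python
# Minimal sketch: uploading one file to Google Drive with google-api-python-client.
# "service_account.json" and "results.csv" are hypothetical file names.
from google.oauth2 import service_account
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

creds = service_account.Credentials.from_service_account_file(
    "service_account.json",
    scopes=["https://www.googleapis.com/auth/drive.file"],
)
drive = build("drive", "v3", credentials=creds)

media = MediaFileUpload("results.csv", mimetype="text/csv")
uploaded = drive.files().create(
    body={"name": "results.csv"},   # file metadata
    media_body=media,
    fields="id",
).execute()
print("uploaded file id:", uploaded["id"])
```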

I would like to discuss specifically some advancements made in imaging science that are opening a door to the study of neurodegenerative diseases, which involves the sharing and distribution of huge amounts of data: a typical MRI image starts at around 1 GB in size. Looking at different conditions of the brain, say epileptic conditions, itself requires a large number of samples to study, each starting from 1 GB. A solution to this, which changed the field of radiology, is PACS (Picture Archiving and Communication System) servers.

A picture archiving and communication system (PACS) is a medical imaging technology which provides economical storage and convenient access to images from multiple modalities (source machine types). Electronic images and reports are
transmitted digitally via PACS; this eliminates the need to manually file, retrieve, or
transport film jackets, the folders used to store and protect X-ray film. The universal
format for PACS image storage and transfer is DICOM (Digital Imaging and
Communications in Medicine). Non-image data, such as scanned documents, may
be incorporated using consumer industry standard formats like PDF (Portable
Document Format), once encapsulated in DICOM.
Combined with available and emerging web technology, PACS has the ability
to deliver timely and efficient access to images, interpretations, and related data.
PACS reduces the physical and time barriers associated with traditional film-based
image retrieval, distribution, and display.
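As a small illustration of working with the DICOM format mentioned above, the sketch below assumes the third-party pydicom library (and NumPy for the pixel data); the file name "scan.dcm" is a hypothetical example.

```python
# Minimal sketch: reading a DICOM image with the pydicom library.
# "scan.dcm" is a hypothetical file name.
import pydicom

ds = pydicom.dcmread("scan.dcm")   # parse the DICOM file
print(ds.PatientID, ds.Modality)   # a couple of standard header fields
pixels = ds.pixel_array            # image data as a NumPy array
print("image shape:", pixels.shape)
```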

Another example is the Worldwide LHC Computing Grid (WLCG) project, part of the potentially game-changing Large Hadron Collider experiment, which may well change the way we look at the Universe.

WLCG is a global collaboration of more than 170 computing centres in 42 countries, linking up national and international grid infrastructures. The mission of the WLCG project is to provide global computing resources to store, distribute and analyse the ~50 petabytes of data expected in 2018, generated by the Large Hadron Collider (LHC) at CERN on the Franco-Swiss border.

Approximately 600 million times per second, particles collide within the Large
Hadron Collider (LHC). Each collision generates particles that often decay in complex
ways into even more particles. Electronic circuits record the passage of each particle
through a detector as a series of electronic signals, and send the data to the CERN
Data Centre (DC) for digital reconstruction. The digitized summary is recorded as a
"collision event". Physicists must sift through the 30 petabytes or so of data produced
annually to determine if the collisions have thrown up any interesting physics.

5. DATA ANALYSIS

The hard part is analyzing chunks of data obtained from various sources.

Just recently, the US unveiled a supercomputer named Summit, housed in a facility at the Oak Ridge National Laboratory. It can perform around 200 quadrillion calculations per second, which makes it about a million times faster than a typical home computer and about twice as fast as the previous world record holder. It is made up of around 37,000 processors and takes up as much space as two tennis courts.

This is how far our requirements have grown for analyzing the data we obtain. For small amounts of data, any spreadsheet package like MS Excel will do.

Spreadsheets are equipped with a handful of mathematical functions, and with a slight grasp of mathematics we can write our own formulas to find the mean, mode, standard deviation, variance, etc., of a given set of raw or classified data.
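The same summary statistics take only a few lines in Python; the data values in the sketch below are made-up examples.

```python
# Minimal sketch: basic descriptive statistics with Python's standard library.
# The data values are made-up examples.
import statistics

data = [4.1, 4.3, 4.3, 4.7, 5.0, 5.2, 5.2, 5.2, 5.6]

print("mean:    ", statistics.mean(data))
print("mode:    ", statistics.mode(data))
print("std dev: ", statistics.stdev(data))     # sample standard deviation
print("variance:", statistics.variance(data))  # sample variance
```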

If you move on to solving, say, n unknowns from n linear equations using pen and paper, it becomes a tedious process; beyond n = 3 it is already difficult. But with some knowledge of a programming language such as C++ or Python, the system can be solved easily on a computer (a short sketch follows the list below).

 C++: a general-purpose programming language with imperative, object-oriented and generic programming features, while also providing facilities for low-level memory manipulation.
 Python: an interpreted, high-level programming language for general-purpose programming.
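As a minimal illustration, the sketch below solves a small linear system with NumPy; the coefficient values are made-up examples.

```python
# Minimal sketch: solving a 3x3 linear system A x = b with NumPy.
# The coefficients are made-up example values.
import numpy as np

A = np.array([[2.0,  1.0, -1.0],
              [1.0,  3.0,  2.0],
              [3.0, -2.0,  4.0]])
b = np.array([3.0, 13.0, 9.0])

x = np.linalg.solve(A, b)     # direct solution for a square, non-singular system
print("solution:", x)
print("check A @ x:", A @ x)  # should reproduce b
```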

When you are dealing with big data analysis, the R programming language is one of the most useful tools.
R is a language and environment for statistical computing and graphics. It is a GNU project similar to the S language and environment created at Bell Laboratories by John Chambers and colleagues. R has brought revolutionary changes to big data analytics and other aspects of data analysis and data science.
It is a free software environment supported by the R Foundation for Statistical Computing, and it is widely used among statisticians and data miners for developing statistical software and performing data analysis.

There will be numerous situations during our research when we have to plot graphs and fit curves.
To quote a few examples, sophisticated experimental techniques like XRD, Raman and UV-Visible spectroscopy are equipped with computer interfaces, and what we obtain as a result is thousands of data points ready to be plotted on a graph to extract useful information.
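As a small illustration of curve fitting, the sketch below uses NumPy and SciPy to fit a Gaussian peak, of the kind one might see in a spectrum, to synthetic (made-up) data; none of the numbers come from a real measurement.

```python
# Minimal sketch: fitting a Gaussian peak to noisy synthetic "spectral" data.
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amp, centre, width):
    """Simple Gaussian peak model."""
    return amp * np.exp(-((x - centre) ** 2) / (2.0 * width ** 2))

# Synthetic data: a Gaussian plus random noise (not a real spectrum).
x = np.linspace(400.0, 700.0, 300)   # e.g. wavelength in nm
y = gaussian(x, 1.0, 532.0, 12.0) + 0.02 * np.random.randn(x.size)

popt, _ = curve_fit(gaussian, x, y, p0=[1.0, 520.0, 10.0])
amp, centre, width = popt
print(f"fitted peak: amplitude={amp:.3f}, centre={centre:.1f} nm, width={width:.1f} nm")
```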

There is a lot of software that helps you turn data into useful results by plotting it. To name a few, we have Origin, SigmaPlot, KPlot, TeraPlot, etc.

Origin: Origin is the data analysis and graphing software of choice for over half
a million scientists and engineers in commercial industries, academia, and
government laboratories worldwide. Origin offers an easy-to-use interface for
beginners, combined with the ability to perform advanced customization as you
become more familiar with the application.

Origin graphs and analysis results can automatically update on data or parameter change, allowing you to create templates for repetitive tasks or to perform batch operations from the user interface, without the need for programming.

The graph shown alongside was plotted using Origin at the Optics & Spectroscopy Lab, Department of Physics, Union Christian College.

6. INSTRUMENTATION AND CONTROLLING

As mentioned before, most of the equipment associated with experiments comes with a graphical user interface, underlining the role of computers in research. For instance, a typical spectrophotometer supplied by Ocean Optics, a distributor, comes with paid software named OceanView, which makes data handling and analysis easy.

Beyond data handling, equipment is also controlled by dedicated computers. For example, the spray pyrolysis coating unit of the Department of Physics has a dedicated computer controlling its functions through a pre-programmed interface. And that is just the simplest use of dedicated systems.
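Simple instrument control of this kind can also be scripted directly. The sketch below assumes a serial-connected instrument and the pyserial library; the port name and the command strings are hypothetical and would differ for any real device.

```python
# Minimal sketch: sending commands to a serial-connected instrument with pyserial.
# The port name and command strings are hypothetical, not from a real instrument.
import serial

with serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=2) as instrument:
    instrument.write(b"SET_TEMP 350\n")   # hypothetical command: set substrate temperature
    instrument.write(b"READ_TEMP?\n")     # hypothetical query: read the temperature back
    reply = instrument.readline().decode().strip()
    print("instrument reply:", reply)
```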

7. PREPARATION OF DRAFT

Usually we rely on MS Word, which has more than enough facilities for a layperson. But when we have to draft rigorous mathematical equations or scientific formulas, enriched with alphabets borrowed from every known language, it is best to use TeX.

TeX is a document preparation system, though using it feels more like learning a programming language. It was designed with two main goals in mind: to allow anybody to produce high-quality books using minimal effort, and to provide a system that gives exactly the same results on all computers, at any point in time.

It can be difficult to learn at first, but once we get the hang of it, just like any other language, we will never go back to the ready-made packages of tools like MS Word.
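As a small illustration, a minimal document in LaTeX (a widely used macro package built on TeX) with one displayed equation might look like the following; the equation shown is just an arbitrary example.

```latex
% Minimal LaTeX document with one displayed equation (an arbitrary example).
\documentclass{article}
\usepackage{amsmath}  % standard package for mathematical typesetting
\begin{document}

The time-independent Schr\"odinger equation can be typeset as
\begin{equation}
  -\frac{\hbar^{2}}{2m}\,\frac{d^{2}\psi(x)}{dx^{2}} + V(x)\,\psi(x) = E\,\psi(x).
\end{equation}

\end{document}
```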

8. PUBLICATION

Not to mention, computers have made publishing so easy that all you need to do to publish your genuine work is to upload it to a journal's online submission system.

****************************************************
