Data Science

DATA SCIENCE

Under the guidance of Dr. Rajagopalan

Table of Contents

Abstract
Introduction
Functional areas of Data Science
Applying Data Science
Conclusion
References

Abstract

This paper introduces the basics of data science. It presents a few organizations and products, such as Google and iTunes, where data science has been applied. It then outlines the functional areas of data science and gives a brief idea of each one; these areas are among the main things to know for anyone who wants to learn about data science. The functional areas follow the description given by the data scientist D.J. Patil, one of the early practitioners to explain the concept of data science. As an example, we look at the Forward Internet Group, which applies data science on its websites to help users find appropriate products and to increase its sales. We also briefly explain Hadoop, one of the popular tools used for data analytics.

Introduction

Data science is used to obtain and manage data efficiently. In a competitive world, companies need to create new products efficiently and with good quality. The future belongs to the companies and people that turn data into products (Loukides, 2010). Data science can be defined as a combination of data analysis and problem solving.

The web is full of e-commerce applications, and almost every e-commerce application is a data-driven application. Most web applications are based on a 3-tier architecture: a database, a front end, and middleware that acts as the interface between the front end and the database. The middleware may also communicate with other data sources, such as banks, credit card processors and other organizations. Data science is the process of extracting value from data, and it creates more data as a result. A data application is not just an application with data; it is a data product. Data science enables the creation of data products (Loukides, 2010).

Examples of data science: CDDB is one of the earlier data products used on the web. It relies on the observation that every CD has a unique signature, based on the exact length of each track on the CD. CDDB is built on a database of track lengths coupled to a database of album metadata. iTunes is an example of a CDDB client: when a CD is loaded, the software reads the length of every track and sends the lengths to CDDB, which returns the track titles. Whatever audio CD is loaded, the lookup key is derived from the tracks on it. CDDB's business is fundamentally different from selling music, sharing music, or analyzing musical tastes (though these can also be data products). CDDB arises entirely from viewing a musical problem as a data problem (Loukides, 2010).

Google is one of the companies that builds data products. For example, Google recognized that a search engine could use input other than the text on the page. Google's PageRank algorithm was among the first to use data outside of the page itself, in particular, the number of links pointing to a page. Tracking links made Google searches much more useful, and PageRank has been a key ingredient to the company's success (Loukides, What is data science, 2012). Google also corrects spelling: it suggests correct spellings for misspelled words entered by the user. This is achieved by building dictionaries of common misspellings and the contexts in which they arise.

Data science helps companies make sense of the huge volumes of digital information they collect every day. That data ranges from internal sales reports to customer tweets. Data scientists are the experts in data science. Companies like Google and Amazon employ large numbers of data scientists to work with the raw data they hold. Nowadays, companies such as Walmart and Foursquare are also hiring data scientists.
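The CDDB lookup described in this section can be illustrated with a minimal sketch. It assumes the signature is simply the ordered list of track lengths in seconds; the real CDDB disc identifier is computed differently, and the names `disc_signature`, `ALBUM_DB` and `lookup_titles`, as well as the sample album, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Signature = Tuple[int, ...]


def disc_signature(track_lengths_seconds: List[int]) -> Signature:
    """Build a lookup key from the exact track lengths, in track order.

    Assumption: the ordered lengths alone identify the disc. This is a
    simplification, not the actual CDDB disc ID algorithm.
    """
    return tuple(track_lengths_seconds)


# Hypothetical in-memory stand-in for the album metadata database:
# signature -> track titles.
ALBUM_DB: Dict[Signature, List[str]] = {
    disc_signature([215, 187, 242]): ["Track A", "Track B", "Track C"],
}


def lookup_titles(track_lengths_seconds: List[int]) -> Optional[List[str]]:
    """Return the track titles for a disc, or None if the disc is unknown."""
    return ALBUM_DB.get(disc_signature(track_lengths_seconds))


known = lookup_titles([215, 187, 242])    # disc present in the database
unknown = lookup_titles([215, 187, 243])  # one track differs by a second
```

The point of the sketch is that the client never inspects the music itself: the identification is a pure data lookup keyed on measurements of the disc.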

Functional areas of Data Science

According to D.J. Patil, data science consists of four functional areas:

1) Business intelligence.
2) Fraud and security.
3) Data infrastructure.
4) Product and marketing analytics.

Each area is explained briefly below.

Business intelligence is the process of understanding what users require and how they use a product. Business intelligence teams calculate, for example, the percentage of users who use a particular product, and they rely on reporting tools and dashboards for their analysis. They also use techniques such as A/B testing, also called split testing or bucket testing. This is a type of marketing test that compares different variants on samples of users and tries to improve response rates. This work helps the organization learn the exact requirements of its users and deliver better output.

The second functional area covers fraud, abuse, risk and security. It is the defense against the dark arts: the organization's data must be protected from intruders and hackers who try to steal secured information.

The third functional area is data infrastructure. It is one of the most important tools in an organization (considering data science only at the organizational level). The organization should have its own data warehouse, which is used to clean and transform the data and make it available to all authenticated users for their business. A data warehouse helps people make decisions. The infrastructure team has to build tools that help the organization and develop distributed systems that move data across heterogeneous networks.

The fourth functional area is product and marketing analytics. It helps the organization analyze the user experience and the value proposition. Every organization needs to know what its users need and develop products accordingly. Data science helps funnel users toward suitable products and prevents dead-end flows. It also supports users with search and with appropriate recommendations based on their requirements.
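The A/B (split) testing mentioned under business intelligence can be sketched as follows. The paper does not say how the variants are compared; this example assumes a two-proportion z-test, which is one common choice, and the visitor and response counts are invented for illustration.

```python
import math


def response_rate(responses: int, visitors: int) -> float:
    """Fraction of visitors who responded (clicked, bought, signed up)."""
    return responses / visitors


def two_proportion_z(resp_a: int, n_a: int, resp_b: int, n_b: int) -> float:
    """z statistic for the difference between two response rates.

    Uses the pooled rate under the assumption that both variants
    perform the same; a large |z| suggests that they do not.
    """
    pooled = (resp_a + resp_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (response_rate(resp_b, n_b) - response_rate(resp_a, n_a)) / std_err


# Illustrative numbers only: variant A is the current page, B the new one.
rate_a = response_rate(100, 1000)
rate_b = response_rate(130, 1000)
z = two_proportion_z(100, 1000, 130, 1000)

# 1.96 is the usual two-sided threshold at the 5% significance level.
b_differs_from_a = abs(z) > 1.96
```

With these sample counts the z statistic is a little above the 1.96 threshold, so variant B's higher response rate would be treated as unlikely to be chance alone.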

Applying Data Science

Consider the Forward Internet Group as an example. It is a digital agency that offers price comparison through the site uswitch.com. The site lets users compare prices drawn from different databases and websites and helps them select a product. According to Farquhar, a lead developer at the Forward Internet Group, the company receives gigabytes of data from Google every day, and its existing database infrastructure was not enough to analyze that much data. To analyze it, the company turned to an alternative tool called Hadoop. Hadoop is an open source framework for distributed analytics. It follows a non-relational approach to storing and processing data, and this kind of processing is often called big data analytics.

Hadoop is one of the more recent technologies used for data analytics. It is a framework that supports processing of huge data sets stored across distributed machines, and it is useful when there are petabytes of data to analyze. Yahoo is one of the organizations that uses Hadoop for its business. Hadoop uses HDFS (Hadoop Distributed File System) to replicate data and to place the copies on different racks. Hadoop aims to be self-healing in case of system failures: it keeps copies of the data on several nodes and re-creates lost copies when a node fails.
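The replication and self-healing behaviour described above can be illustrated with a toy model. This is not the Hadoop API and not HDFS's actual placement policy: the class `ToyCluster`, its methods, and the rule "prefer a rack that holds no copy yet" are assumptions made for the sketch. The replication factor of 3 matches the HDFS default.

```python
from collections import defaultdict
from typing import Dict, List, Set

REPLICATION_FACTOR = 3  # HDFS default replication factor


class ToyCluster:
    """Toy model of block replication across racks (not real HDFS code)."""

    def __init__(self, node_racks: Dict[str, str]) -> None:
        self.node_racks = dict(node_racks)  # node name -> rack name
        self.block_locations: Dict[str, Set[str]] = defaultdict(set)

    def _pick_nodes(self, block_id: str, needed: int) -> List[str]:
        """Pick live nodes without the block, preferring unused racks."""
        holders = self.block_locations[block_id]
        chosen: List[str] = []
        for _ in range(needed):
            used_racks = {self.node_racks[n] for n in holders | set(chosen)}
            candidates = [
                n for n in self.node_racks
                if n not in holders and n not in chosen
            ]
            if not candidates:
                break  # not enough nodes left to reach the target
            # False sorts before True, so nodes on unused racks come first.
            candidates.sort(key=lambda n: (self.node_racks[n] in used_racks, n))
            chosen.append(candidates[0])
        return chosen

    def store(self, block_id: str) -> None:
        """Write a block and replicate it up to the replication factor."""
        for node in self._pick_nodes(block_id, REPLICATION_FACTOR):
            self.block_locations[block_id].add(node)

    def fail_node(self, dead: str) -> None:
        """Remove a node and re-create the copies it was holding."""
        del self.node_racks[dead]
        for block_id, holders in self.block_locations.items():
            holders.discard(dead)
            missing = REPLICATION_FACTOR - len(holders)
            for node in self._pick_nodes(block_id, missing):
                holders.add(node)


cluster = ToyCluster(
    {"n1": "rack1", "n2": "rack1", "n3": "rack2", "n4": "rack2", "n5": "rack3"}
)
cluster.store("block-1")
before = set(cluster.block_locations["block-1"])
cluster.fail_node("n3")
after = set(cluster.block_locations["block-1"])
```

After the simulated failure, the block is back at three copies on the surviving nodes, which is the self-healing idea in miniature.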

Conclusion

This paper gives a brief introduction to data science. We looked at organizations such as Google and Yahoo that use data science, and we illustrated its use with applications such as iTunes. We described the four functional areas, namely business intelligence, fraud and security, data infrastructure, and product and marketing analytics, and gave a brief explanation of each with respect to data science.

References

Loukides, M. (2010). What is data science? O'Reilly Radar. Retrieved from http://cdn.oreilly.com/radar/2010/06/What_is_Data_Science.pd

Slocum, M. (2010). Data science democratized. O'Reilly Radar.

Smith, D. (2011, May 11). Revolutions. Retrieved from http://blog.revolutionanalytics.com/2011/05/data-science-whats-in-a-name.html

Warden, P. (2011). Why the term "data science" is flawed but useful. O'Reilly Radar.