Thinking in Statistics
Ebook, 215 pages, 1 hour

About this ebook

The objective of this book is not to present an elaborate spectrum of all the potential tools, tests, and techniques, but to provide a quick and efficient way of making sense of the data at hand.

This book is an attempt to help learners navigate through data without being overwhelmed by the number of techniques and tests. Data is the first step to discovery.

There is no pride to be taken in the knowledge or mastery of arcane or relatively obscure tools, tests, or methods. Similarly, the scientific discipline of data analysis does not have to become an elitist or classist one in which a chosen few practice information apartheid against the needy.

Hence one may do well to know that data analysis is a skill to be mastered for everyday needs, and that essentials must always take precedence over flashiness and empty pride.

Our day-to-day cognition is itself based on our inherent analysis of day-to-day events, moment by moment. So everybody is using all the known, and perhaps many unknown, principles of data analysis and synthesis as the inputs of the five senses (vision, hearing, smell, taste, and touch) are processed, along with the very process of thought formation and the resulting sentiments.

Thus, everybody (any living being capable of sensing) has already been doing data analysis, processing, and synthesis on a gigantic scale.

Each of us could have called ourselves a statistician since birth, yet we never brag about it, with some exceptions of course...

This series of notes will cover the very basics and let the learner explore on his or her own beyond the platform of fundamental analysis.

These notes are deliberately not organized into chapters or sections; they are meant to deliver a free-form reading experience. One can start anywhere, depending on interest and need.

This is the first part, covering the most basic concepts. I look forward to publishing more very soon.

I hope you gain some useful insights from these.

Abhijit Anant Telang

Language: English
Release date: Aug 17, 2018
ISBN: 9781386768234
Author

Abhijit Anant Telang

After traversing, discovering, and learning through the last 20 years of my professional career, I have taken up writing to express simple, honest, candid, and intuitive views. In doing so, I hope to present a unique and intellectually honest perspective in my forthcoming books. Essence of Karna's Ordeal is my first such attempt. I hope you like it. The second book is about a relatively new phenomenon that aspirants should be aware of: Understanding Psychotic Chasers: Why it is important to know who they are, what they practice and how to deal with them? My subsequent books on the sacred Indian epic of the Ramayana, based on my own interpretation of the Ramcharitmanas by Shri Tulasi Das, are also available. Recently, the first part of my book Thinking in Statistics was published. Bhaja Govindam (This book) is my latest write-up, based on a composition by the great Indian saint Adi Shankaracharya, who lived in 8th-century India. I will be glad to get your reviews and comments. Abhijit

    Book preview

    Thinking in Statistics - Abhijit Anant Telang

    Credits

    To my loving mother, Nirmala Anant Telang; my loving wife, Preeti; and in memory of my beloved late brother, Jayant Anant Telang, and late father, Anant Govind Telang.

    -Abhijit Anant Telang

    Descriptive Analysis

    This is about what the data is centred on, and about how and in which ways parts of it try to run away from that centre, exist separately, or fail to conform to what every other conforming element is doing.

    Descriptive analysis is hence about building an understanding of the phenomenon of assembly, congregation, or surrounding.

    There can be several examples.

    The gathering of a crowd is a commonplace example. Whenever anyone sees a crowd, the first thought that comes to mind is: where is the cynosure? Who or what is getting the attention of the crowd? What is it that the crowd is looking at or looking for?

    This thought perhaps arises from the inherent efficiency built into our brains, which try to cut through the crap and get to the essence. What is the most central thing that needs to be discovered, so that a cognitively expensive and time-consuming exercise can be avoided or, if it cannot be avoided entirely, so that the most can be extracted from such a cognitive investment? One may find oneself doing the same even while reading this book, or any other for that matter.

    Thus, central tendency is about finding the essence, finding the centre, if such a centre indeed exists. If it exists, then perhaps it can explain much about how everything and everybody else is oriented towards it or influenced by it.

    Also, if such a centre indeed exists, what proportion of observations lean towards it or away from it?
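
    As a minimal sketch of these two questions (where is the centre, and what share of observations sits near it), the Python snippet below uses only the standard library. The observations are made up for illustration, and treating "near" as "within one standard deviation of the mean" is an assumption of this sketch, not a rule from the text.

```python
from statistics import mean, median, pstdev

# Hypothetical observations, invented purely for illustration.
values = [4, 5, 5, 6, 5, 4, 6, 5, 20]

centre_mean = mean(values)      # pulled towards the stray value 20
centre_median = median(values)  # stays with the bulk of the data
spread = pstdev(values)         # population standard deviation

# Assumed definition of "near": within one standard deviation of the mean.
near = [v for v in values if abs(v - centre_mean) <= spread]
share_near = len(near) / len(values)

print(f"mean={centre_mean:.2f} median={centre_median} share_near={share_near:.2f}")
```

    The mean and the median disagree here because a single stray value pulls the mean towards itself, while the median stays with the bulk of the data.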

    How significant are such observations (considering the guiding principle of cognitive efficiency)? And if one can indeed choose to ignore them, what is the cost one must bear for such indifference and ignorance? How long can one continue ignoring them, considering such costs?

    If one chooses not to ignore such observations, what is the cost of that too, in terms of the extent to which they may cloud or thwart perception?
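
    One rough way to put a number on the cost of ignoring stray observations is to compare the centre computed with and without them. The sketch below is only illustrative: the data are hypothetical, and the cut-off (more than three median absolute deviations from the median) is an assumed rule of thumb rather than anything prescribed here.

```python
from statistics import mean, median

# Hypothetical observations; 20 sits far from the rest.
values = [4, 5, 5, 6, 5, 4, 6, 5, 20]

centre = median(values)
# Median absolute deviation: the typical distance from the centre.
mad = median([abs(v - centre) for v in values])

# Assumed rule of thumb: "stray" means more than 3 MADs from the median.
cutoff = 3 * mad
stray = [v for v in values if abs(v - centre) > cutoff]
kept = [v for v in values if abs(v - centre) <= cutoff]

# How far the mean moves when the stray observations are ignored.
shift = mean(values) - mean(kept)

print(f"stray={stray} mean_all={mean(values):.2f} "
      f"mean_kept={mean(kept):.2f} shift={shift:.2f}")
```

    A large shift suggests that ignoring the stray observations changes the picture materially; a small one suggests they can be set aside at little cost.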

    Cognitive awareness of any given phenomenon is built on the foundation of a descriptive analysis of a previously existing object or being. For multi-dimensional data, this analysis must be done per dimension.
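
    Taking the analysis per dimension can be sketched as follows. The objects, the dimension names (`weight_kg`, `height_cm`, `width_cm`), and their values are all hypothetical.

```python
from statistics import mean, median

# Hypothetical objects, each described along three dimensions.
objects = [
    {"weight_kg": 2.0, "height_cm": 10.0, "width_cm": 4.0},
    {"weight_kg": 3.0, "height_cm": 12.0, "width_cm": 5.0},
    {"weight_kg": 7.0, "height_cm": 11.0, "width_cm": 6.0},
]

# Summarise each dimension separately instead of the object as a whole.
summary = {}
for dim in objects[0]:
    column = [obj[dim] for obj in objects]
    summary[dim] = {"mean": mean(column), "median": median(column)}

for dim, stats in summary.items():
    print(dim, stats)
```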

    To build awareness, though, one must decide which dimension or aspect of a given object, person, or being one needs to understand.

    It is cool enough, mesmerizing enough, to look at the vast blue sky as the integrated, synthesized perception that settles in the mind. However, the ability to enjoy a view is independent of the ability to analyse it. The first comes to us automatically, but the second must be earned.

    If one really wants to understand the sky, one must begin to deconstruct the synthesized perception that was formed in the mind, and then explore the composition of the sky: its colour, its brightness, its hue, the obstructions that cloud the view, and the various celestial objects, such as the Sun, the Moon, and the constellations, that dot it. The time at which an object is viewed must also be considered. An inspection of composition reveals the elements or parts that constitute an integrated view.

    The presence or absence of constituent elements needs to be considered and, if they are present, so does their contribution to the overall view.

    The view that one has formed in perception thus needs to be analysed by dimensionality. Do I need to study the colour of the sky? Do I need to study its brightness? Do I need to study the distance between the celestial objects in the sky? There are many more such considerations.

    So one needs to step down from the integrated perception, which is easy to enjoy, to what constitutes it; then one begins to see the various objects that might be contributing to the perception formed earlier.

    Any perceptible real-world object can be imagined as radiating dimensionalities, like rays emanating from our Sun. One ray of dimension can be weight. Others can be length, breadth, height, radius, angle of rotation, and of course time.

    Hence, to build awareness about any given object, one (a human, or a program acting autonomously) must first decide along which dimension or dimensions the object needs to be understood and evaluated first.

    The exact priority will depend on the purpose of the evaluation. Someone may decide to study the weight of an object, or its colour, height, breadth, or width. We do that when we come across other humans, animals, or plants. We exclaim how tall or thick a given tree is, how fat or thin a given person is, how dark or light someone’s skin is, how large or small someone’s eyes or face are, and so on. Cognition needs to remember the appearance, feel, taste, or smell of objects to form associative memories. However, judgement based on outward appearance (and the ethics of such discrimination) is an altogether different consideration and not the subject or focus of this book.

    Once a dimension of importance or criticality to cognition is chosen, cognition sets about building an estimate. Here again, the compulsion to be efficient forces cognition to seek the central tendency.

    What is it all about?

    Understanding the expression of dimensionality

    To know what it is, the behaviour of the chosen dimensionality needs to be quantized so that it can be expressed. This is where cognition meets the numerical and Boolean systems of representing state.

    Discreteness

    First, one needs to consider whether an aspect or dimensionality is present or not, and this needs to be represented in Boolean form, that is, Yes or No, or TRUE or FALSE. The indication of presence or absence is typically captured in Boolean indicator variables.
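
    A Boolean indicator variable of this kind can be sketched in a few lines of Python. The records and the field name `tail_length_cm` are hypothetical, and `None` is assumed to mean that the aspect was not observed at all.

```python
# Hypothetical records; None marks an aspect that is absent or unrecorded.
records = [
    {"name": "A", "tail_length_cm": 12.5},
    {"name": "B", "tail_length_cm": None},
    {"name": "C", "tail_length_cm": 0.0},
]

# Indicator: True when the aspect is present, False when it is not.
# A recorded value of 0.0 still counts as present; only None means absent.
has_tail_length = [r["tail_length_cm"] is not None for r in records]

print(has_tail_length)
```

    Keeping presence separate from magnitude avoids confusing "measured as zero" with "not measured".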

    Gauging the representation of its expression

    Next, when a particular aspect or dimensionality is indeed present, how is it represented?

    Can the behaviour be expressed as a specific value that in turn expresses a magnitude?

    Is this value unique for each occurrence?

    If not unique for each occurrence, does it repeat? If so, which values and how often?

    Does it have a unit of measure?

    Is the behaviour continuous or discrete?

    Are any values missing? And what could those have been?

    If continuous, where does it begin and where does it end? Is it possible to sort the stream of values into ascending or descending order?

    Does it have any specific order or precedence? For instance, how does one occurrence compare to another? How would one know if two or more occurrences are indistinguishable from each other? How would one know if one value is superior or inferior to another? And what should the criteria for such a comparison be?

    The answers to these questions lead one to think in terms of statistical measures that help measure the manifestation of dimensionality.
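
    Several of the questions above can be answered mechanically for a single column of values. The sketch below is one possible way to do so: `profile` is a hypothetical helper written for this illustration, the column is invented, and `None` is assumed to mark a missing value.

```python
from collections import Counter

def profile(values):
    """Summarise one column: missing values, uniqueness, repeats, range."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    ordered = sorted(present)  # assumes the values can be compared
    return {
        "n_total": len(values),
        "n_missing": len(values) - len(present),
        "all_unique": len(counts) == len(present),
        "repeated": {v: n for v, n in counts.items() if n > 1},
        "min": ordered[0] if ordered else None,
        "max": ordered[-1] if ordered else None,
    }

# Hypothetical column with one missing value and some repeats.
column = [3, 7, 7, None, 2, 9, 3, 7]
report = profile(column)
print(report)
```

    Questions such as the unit of measure, or what a missing value could have been, cannot be read off the numbers alone and still need knowledge of how the data was collected.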

    Consider the following data snapshot.

    All we know at this point is that the rightmost column is the output and that Feature 1..4 are inputs.

    Hence, the first logical step would be to study what each column of values represents.

    [Figure not included in this extract]

    This tells us something about the spread and frequency of values.

    But perhaps we need to know the specific counts, or frequency of occurrence.

    [Figure not included in this extract]
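
    Counting how often each value occurs, column by column, can be sketched with `collections.Counter` from the Python standard library. The book's own data snapshot is not reproduced here, so the rows below are invented, and only two input features are shown alongside the output.

```python
from collections import Counter

# Hypothetical rows; the real snapshot has Feature 1..4 plus an output column.
rows = [
    {"Feature 1": 1, "Feature 2": "a", "Output": 0},
    {"Feature 1": 1, "Feature 2": "b", "Output": 1},
    {"Feature 1": 2, "Feature 2": "a", "Output": 0},
    {"Feature 1": 1, "Feature 2": "a", "Output": 1},
    {"Feature 1": 3, "Feature 2": "b", "Output": 0},
]

# One frequency table per column: value -> number of occurrences.
freq = {col: Counter(row[col] for row in rows) for col in rows[0]}

for col, counts in freq.items():
    print(col, counts.most_common())
```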

    And then visualize this as well.

    [Figure not included in this extract]
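
    When no plotting library is at hand, frequencies can still be visualized as a plain-text bar chart. In this sketch `text_bars` is a hypothetical helper and the values are invented for illustration; it is not the chart from the book.

```python
from collections import Counter

def text_bars(counts):
    """Return one line per distinct value, with one '#' per occurrence."""
    lines = []
    for value in sorted(counts):
        n = counts[value]
        lines.append(f"{value!s:>3} | {'#' * n} ({n})")
    return lines

# Hypothetical column of values.
values = [1, 1, 2, 1, 3, 2, 1]
bars = text_bars(Counter(values))
print("\n".join(bars))
```

    Sorting by value keeps the bars in the natural order of the column, which makes gaps and lopsidedness easier to spot than in a table of counts.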