#ICYMI: R Descriptive Statistics/Summary Statistics
Data gathered for an analysis is only useful when it is presented in a form that everyone can understand and that supports sound decision making. Condensing the results of an analysis into such a form therefore plays a vital role. This is known as summarizing the data.
We can summarize data in several ways, either as text or as a picture.
Below are the ways of summarizing data in R:
* Descriptive/Summary Statistics – R descriptive statistics (Summary statistics) are the first figures used to represent nearly every dataset. They also form the foundation for much more complicated computations and analyses. Thus, in spite of being composed of simple methods, they are essential to the analysis process.
* Tabulation – Representing the analyzed data in tabular form for easy understanding.
* Graphical – Representing the data graphically, for example as a chart or plot.
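As a rough sketch of how the three approaches above might look in base R, the snippet below computes two summary statistics, builds a frequency table, and draws a bar chart. The `quantity` vector is made-up data used only for illustration; it does not come from the article.

```r
# Hypothetical sample data, invented for this illustration
quantity <- c(5, 10, 12, 10, 5, 10)

# 1. Descriptive/summary statistics: single figures that describe the data
mean(quantity)
median(quantity)

# 2. Tabulation: count how often each value occurs
counts <- table(quantity)
counts

# 3. Graphical: draw the same counts as a bar chart
barplot(counts, xlab = "Quantity", ylab = "Frequency")
```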
Summary Commands in R
Whenever you start working with a data set, you need an overview of what you are dealing with. There are a few ways of getting one:
As we saw in the earlier session, the ls() command lists the named objects that you have, so you can start with ls().
Once you know which objects are available, you can type the name of an object to view its contents. However, if the object contains a lot of data, the display may be quite large and you may want a more concise way to examine it.
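A minimal sketch of these first steps is shown below. The objects `quantity`, `item`, and `big` are hypothetical and exist only for this illustration. The head() function is not mentioned above; it is included as one common base R way to peek at a large object.

```r
# Hypothetical objects created only for this illustration
quantity <- c(5, 10, 12)
item <- c("Pen", "Pencil", "Rubber")

# List the names of the objects in the current workspace
ls()

# Type an object's name to print its contents
quantity

# For a large object, head() prints only the first few elements
big <- 1:1000
head(big)
```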
You could use the str() command, which is designed to show the structure of a data object rather than a statistical summary. It reports the number of rows and columns in the data, along with the name of each column and its first few values.
To get a quick statistical summary of a data object, you can use the summary() command.
The output of the summary() command depends on the object you are looking at. For numeric data it reports the smallest and largest values, the mean, the median, and similar statistics.
For example, suppose you have the data below:
S.No. Item Quantity
1 Pen 5
2 Pencil 10
3 Rubber 12
The str() command gives output describing:
3 obs. of 2 variables
Item: Pen Pencil Rubber
Quantity: 5 10 12
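To try this yourself, the example table could be entered as a data frame roughly as follows. The name `stock` is an assumption made for this sketch, and the exact layout of the printed output depends on your R version (for instance, older versions show Item as a factor).

```r
# Build the example table as a data frame (the name "stock" is hypothetical)
stock <- data.frame(
  Item = c("Pen", "Pencil", "Rubber"),
  Quantity = c(5, 10, 12)
)

# Show the structure: number of observations, number of variables,
# and the type and first values of each column
str(stock)
```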
The summary() command gives output in the form below:
Min: 5 Max: 12 Mean: 9
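The figures for the Quantity column can be checked with a short sketch like the one below. The vector name `quantity` is an assumption made for illustration. Note that summary() also reports the median and quartiles in addition to the values listed above.

```r
# Quantity values from the example table
quantity <- c(5, 10, 12)

# Statistical summary of the vector
s <- summary(quantity)
s

# The individual statistics can also be computed directly
min(quantity)     # smallest value: 5
max(quantity)     # largest value: 12
mean(quantity)    # (5 + 10 + 12) / 3 = 9
median(quantity)  # middle value: 10
```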
The summary() command is therefore the more useful of the two when you want statistics such as the minimum, maximum, and mean. It works for both matrix and data frame objects, summarizing the columns rather than the rows.
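The column-wise behaviour can be seen in a small sketch such as the following. The matrix `m` and its second column, "Boxes", are invented for this illustration.

```r
# Hypothetical 3 x 2 matrix; the "Boxes" column is made up for illustration
m <- matrix(c(5, 10, 12, 1, 2, 3), ncol = 2,
            dimnames = list(NULL, c("Quantity", "Boxes")))

# summary() produces one block of statistics per column, not per row
summary(m)

# colMeans() confirms the per-column means
colMeans(m)
```

If you need statistics for each row instead, one option is to transpose the matrix with t() before calling summary().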
Read full article>> https://goo.gl/85PZj5