File Analysis
The File Info Window contains four tabs:
Statistics
Statistics of All Columns
Statistics of Selected Columns
If your file contains numerical data, the Statistics tab computes the following basic statistical measures:
- Number Of Data Points
- Sum
- Mean
- Variance
- Standard Deviation
- Median
- Min
- Max
- 20th Percentile
- 40th Percentile
- 60th Percentile
- 80th Percentile
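Note that the variance and standard deviation are the biased versions, that is, they are computed by dividing by the number of data points N rather than by N - 1.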
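To make these measures concrete, here is a minimal Python sketch of how they could be reproduced outside the application. The function name summarize and the dictionary keys are hypothetical. The interpolation method used for the percentiles is also an assumption, since this page does not state which one the Statistics tab uses.

```python
import math
import statistics


def summarize(values):
    """Return the basic measures for a list of numbers (hypothetical helper)."""
    n = len(values)
    total = sum(values)
    mean = total / n
    # Biased variance: divide by n, not n - 1.
    variance = sum((x - mean) ** 2 for x in values) / n
    # quantiles(n=5) returns the four cut points at 20%, 40%, 60% and 80%.
    # The "inclusive" method is an assumption, not the application's documented rule.
    p20, p40, p60, p80 = statistics.quantiles(values, n=5, method="inclusive")
    return {
        "count": n,
        "sum": total,
        "mean": mean,
        "variance": variance,
        "std_dev": math.sqrt(variance),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "p20": p20,
        "p40": p40,
        "p60": p60,
        "p80": p80,
    }


result = summarize([2, 4, 4, 4, 5, 5, 7, 9])
print(result["variance"], result["std_dev"])
```

For this sample the biased variance is 32 / 8 = 4 and the standard deviation is 2. The unbiased variance would be 32 / 7, so the two conventions differ noticeably for small columns.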
Buttons
You can copy part or all of the results to the clipboard using the Whole Table or Selected Values buttons.
The measures do not update automatically when the underlying data changes. To recompute them for the columns you originally analyzed, click the Same Columns button. To analyze different columns, select them and click the Selected Columns button. To analyze every column, click the All Columns button.
Column Attributes
Column Attributes (All Columns)
Column Attributes (Selected Columns)
The Column Attributes tab gives you the following information about each column:
- Number of Fields
- Number of Non-Empty Fields
- Number of Empty Fields
- Percent Non-Empty Fields
- Minimum Field Length
- Maximum Field Length
- Data Type (String, Number, or Date)
Dates are detected in virtually any format.
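The attributes above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the dictionary keys, and the short DATE_FORMATS list are hypothetical, and the application's own date detection is described as far broader than three formats. Whether empty fields count toward the minimum field length is also an assumption.

```python
from datetime import datetime

# Hypothetical, deliberately short list; the application detects many more formats.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def is_number(text):
    """True if the text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def is_date(text):
    """True if the text matches one of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def column_attributes(fields):
    """Compute the per-column attributes for a list of field strings."""
    non_empty = [f for f in fields if f.strip() != ""]
    # Assumption: empty fields count as length 0 in the min/max lengths.
    lengths = [len(f) for f in fields]
    if non_empty and all(is_number(f) for f in non_empty):
        data_type = "Number"
    elif non_empty and all(is_date(f) for f in non_empty):
        data_type = "Date"
    else:
        data_type = "String"
    return {
        "fields": len(fields),
        "non_empty": len(non_empty),
        "empty": len(fields) - len(non_empty),
        "percent_non_empty": 100.0 * len(non_empty) / len(fields) if fields else 0.0,
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
        "data_type": data_type,
    }


attrs = column_attributes(["2021-01-05", "", "03/14/2020", "31.12.2019"])
print(attrs)
```

In this sketch a column is typed as Number or Date only if every non-empty field parses that way. Anything else falls back to String.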
Buttons
You can also copy part or all of the results to the clipboard using the Whole Table or Selected Values buttons.
As in the Statistics tab, these values do not update automatically when the underlying data changes. To recompute them for the columns you originally analyzed, click the Same Columns button. To analyze different columns, select them and click the Selected Columns button. To analyze every column, click the All Columns button.
Unique Values
Unique Values of Selected Column
Select Values in Selected Column to Filter
The Unique Values tab tells you how many times each distinct value appears in a column and what percentage of the column's entries it accounts for. It also provides a histogram to graphically compare the number of appearances of the various values.
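The count, percentage, and histogram can be approximated with a few lines of Python. This is a rough sketch. The function name unique_values is hypothetical, and sorting by descending count is an assumption about how the results are ordered.

```python
from collections import Counter


def unique_values(column):
    """Return (value, count, percent) tuples, most frequent value first."""
    counts = Counter(column)
    total = len(column)
    # most_common() orders the distinct values by descending count.
    return [
        (value, count, 100.0 * count / total)
        for value, count in counts.most_common()
    ]


rows = unique_values(["red", "blue", "red", "green", "red", "blue"])
for value, count, percent in rows:
    # A simple text histogram: one '#' per appearance.
    print(f"{value:<6} {count:>3} {percent:6.1f}%  {'#' * count}")
```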
The Unique Values tab can also be used to pick distinct values to filter on. You can either filter them in (i.e. view only rows with the selected values) or filter them out (i.e. view all rows except those with the selected values).
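The difference between filtering in and filtering out can be sketched as follows. The function filter_rows, its keep parameter, and the sample table are hypothetical and only illustrate the two behaviors.

```python
def filter_rows(rows, column, selected, keep=True):
    """Filter rows by the values in one column.

    keep=True  filters in: only rows whose value is selected remain.
    keep=False filters out: rows whose value is selected are dropped.
    """
    selected = set(selected)
    return [row for row in rows if (row[column] in selected) == keep]


table = [
    {"name": "a", "color": "red"},
    {"name": "b", "color": "blue"},
    {"name": "c", "color": "red"},
    {"name": "d", "color": "green"},
]

filtered_in = filter_rows(table, "color", {"red"}, keep=True)
filtered_out = filter_rows(table, "color", {"red"}, keep=False)
print([row["name"] for row in filtered_in])
print([row["name"] for row in filtered_out])
```

Every row ends up in exactly one of the two results, so filtering in and filtering out on the same selection split the table into two complementary parts.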
Buttons
You can copy part or all of the results to the clipboard using the Whole Table or Selected Values buttons.
As in the other two tabs, the values do not update automatically when the underlying data changes. To refresh the data for the same column, click the Same Column button. To get data on a different column, select it in the main window and click the Selected Column button.
To filter values, select the value or values you want. The selection does not have to be in the value column: you can also select cells in the Count or Percent column. Then click the Selected Values In or Selected Values Out button to filter the values in or out.