Grouping data in bins

Assume we have a lot of observations, like the height and weight of every person in a group, or the time of visit for every visitor to a website.

Lots of data is hard to visualise, so we usually group the data in equal intervals (bins) and count the number of observations in each bin.

The chart below shows the heights of 25,000 eighteen year olds grouped in one inch intervals.

The workbook available here shows how to do this with a couple of simple formulas. We use COUNTIF to count the number of observations greater than a certain height, and take the difference between successive values to count the number of observations in each interval.

Hiding #DIV/0! or #N/A

An old but simple trick to hide error values.

Create a Conditional Format. Say you are in cell B2, use Conditional Formatting -> New Rule -> Use a formula to determine which cells to format -> enter the formula =ISERROR(B2) , click Format, format the text to white.

Now any error value (#N/A, #VALUE!, #REF!, #DIV/0!, #NAME?, #NULL!) in cell B2 will show as white text on a white background. Still there, but you can't see it.

Because the formula uses a relative address, you can use the format painter to copy the conditional format anywhere you want. Good for a pivot table range where calculated fields may show unwanted #DIV/0! or #N/A.

Also, you can use most Information functions to hide or show only what you want users to see. For instance ISNUMBER or ISTEXT.

 

Cascading Drop-Downs

I'm sure you all know how to set up simple cascading drop-down lists. The question is how to build a drop-down list whose values are dependent on the selected value in another drop-down list. For example, if we select a type of animal from a list, can we then show a drop-down that shows all the breeds for that type of animal, and does not show breeds for other animals?

Here is a simple example. First, I set out a list of animals, and corresponding lists of breeds for each animal. Like this;

 

Then I define some range names as shown on the sheet above. To keep things simple, I have not used dynamic ranges although this would work as well with dynamic ranges and longer lists.

Now select a cell and use Data -> Data Validation -> List and set the Source to the range name Animals.

 

The cell will now show a drop-down that lets you select one of Cats, Dogs, or Ducks.

Now select another cell. This will contain a drop-down with a list that is dependent on the selection in the first drop-down. Again use Data -> Data Validation -> List but this time set the Source to =INDIRECT($x$y) where $x$y is the absolute range of the first drop-down.

This way, if you select Dogs in the first drop-down, the second drop-down will point to INDIRECT("Dogs") i.e. the range name Dogs.

 

I have included a sample workbook here. Note that this includes a little bit of VBA code to make things work a bit nicer.

 

Let's make this more interesting...

Larger amounts of data should not require you to


This post shows one way to do this without using VBA.​

First I start with a simple list of States and Cities.

This list is in an Excel sheet, but it could easily come from a database, or a separate csv file.

Then I create two pivot tables from the above data. The first just shows each State and the count of Cities within the State.

 The second pivot table adds City as a row label. Again we just count the Cities, and turn off row totals.

Now we define a few range names;

State just points to the list of States in the first Pivot Table. We can use the INDEX function to make this a dynamic range =Sheet1!$J$5:INDEX(Sheet1!$J:$J,COUNTA(Sheet1!$J:$J))

StateCity points to the first column of the second Pivot Table. Again, we use a dynamic range to define this =Sheet1!$M$5:INDEX(Sheet1!$M:$M,COUNTA(Sheet1!$N:$N)+3)

CityHeader uses an OFFSET function to point to the header in the second Pivot Table =OFFSET(StateCity,-1,1,1,1)

StateNo matches the value in the cell to the left of its own range to the list of States in the first Pivot Table and returns an index number. =MATCH(Sheet1!D18,State,0) Note that this uses a relative reference, so entering =StateNo in a cell will look up the value of the cell to the left and return a number if that cell contains a valid State.

StateCityNo matches the value of the cell to the left to the second Pivot Table. =MATCH(Sheet1!D19,StateCity,0)

StateCityNext returns the index no for the next State in the second Pivot Table. =MATCH(INDEX(State,StateNo+1),StateCity,0)

City uses the range names thus defined to return the list of cities for the selected State based on the second Pivot Table. =OFFSET(CityHeader,StateCityNo,0,StateCityNext-StateCityNo,1)

With that done, we only need to define a couple of drop-down lists using Data Validation.

The first has its Source as =State. The second drop-down must be in the cell immediately to the right of the first drop-down, and its Source is set to =City.

This is probably clearer in the workbook that can be downloaded here.