
Formula Methods in Excel

Optimising calculations in Excel workbooks


This Excel formula manual is suitable for Excel users of all levels. Rather than focusing only on individual functions and formula methods, this course takes a deeper look at how Excel evaluates formulae and concentrates on the most efficient methods available.

Jon von der Heyden, 3/23/2011

System Requirements
At the time of writing, the latest version of Microsoft Excel for Windows is Excel 2010 (Office version 14). This document is written specifically for Office versions for Windows PCs. Unless otherwise stated, all functions and methods are supported in Excel 2003, 2007 and 2010.

Formula Methods in Excel Jon von der Heyden 2011


About Excel Design Solutions


There has been much debate amongst some of the professionals who frequent the Excel forums about what makes a true Excel modeller/developer. Some suggest that a thorough knowledge of Excel's rich features and functionality is unnecessary, favouring business skills and experience. Some even suggest that having little or no VBA programming experience is acceptable too. Others have suggested that all one requires is technical skill and experience, and that it is down to the client to communicate the requirements.

We at Excel Design Solutions believe that a true professional Excel modeller/developer must have both exceptional technical Excel knowledge and exceptional business acumen. That is why each of our consultants participates in the various Excel web forums, on an endless quest to improve our knowledge by addressing other users' and developers' challenges. Each of our consultants has worked in business for many years and is established as a business expert in his or her chosen field. In fact, forum participation and a backbone in business are requirements for any individual seeking opportunities within Excel Design Solutions.

We don't have a large employee base. Whilst we do work directly on projects, we also approach known Excel and business experts to collaborate on our assignments on a per-project basis. Being so directly involved in the forums and the Excel community, we have established relationships with the best in the field, and we collaborate with these individuals on an as-needed basis.

For more information on what Excel Design Solutions can do for you, or to get in touch with someone at Excel Design Solutions, visit the website: www.exceldesignsolutions.com


About Jon von der Heyden (The Author)


Jon is one of the co-founders of Excel Design Solutions, founded in 2007. He has over ten years' experience in finance analysis and commercial management positions. His speciality is management accounting and he relishes complex financial modelling assignments.

Jon initially pursued a career in IT, having studied web design and e-commerce, but was later nudged toward finance when working for a large UK telecoms company back in 2000. Although Jon is not a qualified management accountant, this is the subject that interests him most, and he has spent much time tutoring CIMA graduates by teaching the practical application of the many management accounting methodologies using Excel.

Jon has spent many years working on reorganisation projects as a senior analyst. He specialises in cost analysis, activity-based costing and cost improvement. Achieving cost improvement has often led Jon into the various business operations, giving him valuable insight into the business functions. Process improvement and automation have been the key to Jon's successes. Jon has also been involved in plenty of other projects, including outsourcing, supply chain management and revenue-generating projects.

Jon's most recent experience as a company employee was working in shared services for an international multi-conglomerate, where he acquired 5 years' international experience controlling cost opportunity projects and playing an integral role in the implementation of the shared services global product catalogue and in efficiencies in service delivery and financial planning.


Table of Contents
System Requirements
About Excel Design Solutions
About Jon von der Heyden (The Author)
Index of Tables
Introduction
1. Back to Basics
    Basic Anatomy of an Expression
    Translating an Expression into an Excel Formula
    Statistical Notations and Worksheet Functions
        (Capital) Sigma, Σ
        SUM and SUMPRODUCT
        X bar, X̄
        AVERAGE
    Introduction to Excel Formula
        Basic Anatomy of an Excel Formula
        Operators
        Calculation Order and Operator Precedence
        Cell Referencing
        3-D References
        Union References
        Intersecting Ranges
        Reference Notation
        Defined Names
        Array (CSE) formulae
        Array Constants
2. How the Excel Recalculation Engine Works
    Dependency Trees
    Volatile Functions
    Events that Trigger Recalculation
    Calculation Methods
3. Data Types, Interpretation and Precision
    Data Types
        Numbers
        Booleans
        Errors
        Text
    Floating-Point Precision
        Loss of Precision When Using Very Large Numbers
        Loss of Precision When Using Very Small Numbers
    Boolean Logic
        Coercion
        AND Logic
        OR Logic
    Date and Time Values
4. Introducing Worksheet Functions
    Data Type Conformity
    Nested Worksheet Functions
    Optional Arguments
    Logical and Information Functions
    Lookup Functions
        Further Lookup Tips
        Binary Search versus Linear Search
    Math and Statistical Functions
    Text Functions
    Date Functions
    Database Functions
        DSUM()
        DAVERAGE()
        Database Function Examples
5. Dynamic Named Ranges
    When to Use Dynamic Named Ranges
    One-Dimensional Dynamic Range
        Dynamic Ranges - Numbers Only
        Dynamic Ranges - Text Only
    Multi-Dimensional Dynamic Ranges
6. Using Tables
7. Auditing Formula
8. Funky Formulae
    Get the Month Number of a Financial Year
    Get the Week Number of a Financial Year
    Repeat Each Item in a Table n Times
    Repeat a Table n Times
    Get the nth Element from a String based on a given Delimiter
    3-Dimensional SUMIF
    Multi-Criteria Lookups
    Vlookup returning Multiple Results
    Variable Discounting using Differential Rates
    Extract Numbers from an Alpha-numeric String
    Extract a Date from a Text String
    Calculate the Last Used Row in a Column (useful for Dynamic Ranges)
    Locate a Break-Even Point
9. Shortcuts
    Control Keys
    Function Keys
10. Limitations Table


Index of Tables
Table 1-1 Summing the X and Y values separately
Table 1-2 Summing the XY products
Table 1-3 Summing the X and Y values separately using SUM
Table 1-4 Summing the XY products using SUMPRODUCT
Table 1-5 Arithmetic Operators
Table 1-6 Comparison Operators
Table 1-7 Text Operators
Table 1-8 Reference Operators
Table 1-9 Wildcard Operators
Table 1-10 Operator Precedence
Table 1-11 Using parentheses to change calculation order
Table 1-12 Aggregating unioned references
Table 1-13 Aggregating intersecting references
Table 1-14 R1C1 Notation
Table 1-15 Demonstrating name scope recognition
Table 1-16 Aggregating an Inline Array Constant
Table 1-17 Aggregating an Array
Table 1-18 An Array Entered Formula
Table 2-1 List of Strictly Volatile Functions
Table 2-2 Recalculation Event Triggers
Table 3-1 List of error types
Table 3-2 Example loss of precision when using very large numbers
Table 3-3 Example loss of precision when using very small numbers
Table 3-4 Coercing boolean values to digital values
Table 3-5 Coercing an array of boolean values to an array of digital values
Table 3-6 AND Logic Truth Table
Table 3-7 OR Logic Truth Table
Table 4-1 Basic anatomy of a worksheet function
Table 4-2 Demonstrating the distinct advantage of using SUM over a classic addition expression
Table 4-3 VLOOKUP, exact match and approximate match syntax
Table 4-4 Demonstrating nested worksheet functions within a formula
Table 4-5 Boolean logic, multiplying logical tests to avoid function calls and evaluation steps
Table 4-6 Performing a right-to-left lookup with INDEX and MATCH
Table 4-7 Yielding an intersecting range using INDEX
Table 4-8 Yielding a range using INDEX to return a range operand
Table 4-9 Handling lookup error values
Table 4-10 Rounding to the nearest desired multiple using ROUND
Table 4-11 Rounding up to the nearest desired multiple using CEILING
Table 4-12 Extracting a date from a date and time stamp
Table 4-13 Extracting the time from a date and time stamp
Table 4-14 Summing the nth item in an array using MOD; a stepped approach
Table 4-15 Using MIN and MAX to avoid IF function calls
Table 4-16 Summing the top n values in an array using SUM and LARGE
Table 4-17 Sum or Count a range using multiple criteria with SUMPRODUCT
Table 4-18 Identifying duplicates in a range of values using COUNTIF
Table 4-19 Sum values in a range based on multiple criteria in the same criteria range
Table 4-20 Summing values that correspond to empty cells using SUMIF
Table 4-21 Summing cells that correspond to non-empty cells using SUMIF
Table 4-22 Sum values between two dates using SUMIF
Table 4-23 Offsetting the sum range in SUMIF
Table 4-24 Dropping leading characters with MID and REPLACE
Table 4-25 Return a serial date exactly n months before or after a specified date
Table 4-26 Return the 1st and last day of the month of a given date
Table 4-27 DATEDIF interval values
Table 4-28 Aggregating results with D Functions with a single criterion
Table 4-29 Aggregating results with D Functions using multiple criteria (OR logic)
Table 4-30 Aggregating results with D Functions using multiple criteria (AND logic)
Table 5-1 Dynamic Table of Holiday Dates
Table 6-1 Table reference syntax


Introduction
This material really is intended for anybody. Even the more advanced users are unlikely to know 60% of this material. The only two mandatory criteria for candidates are:

- He or she must want to learn Excel.
- He or she must really want to learn Excel!

This material focuses exclusively on formula methods. Why? Because this is where 90% (or more) of models go wrong! Formulae are probably the single most powerful feature Excel offers, and the one on which outputs depend most heavily.

And let's face it, Excel is huge! You could spend 2 hours a day studying Excel for a year and you still won't scratch the surface. All studying Excel has ever done for me is reveal how much more there is to explore, and give me a hunger to learn more.

This material starts with a gentle stroll as we explore some of the basics of formulae and understand how Excel interprets formulae and computes the results. By the end we will be exploring complex expressions, nesting functions, using array formulae, names, dynamic ranges, tables and all sorts of other exciting stuff! For now, let us just assume EXCEL CAN DO ANYTHING (except make toast!).


1. Back to Basics
Let us start by asking, what is a formula? A formula, in Excel, is an expression entered into a range or name that is recognised by Excel such that it can be processed by its calculation engine to produce a result.

Basic Anatomy of an Expression


Example: 3X² - 4X + 5XY + 3X

Term: There are four terms in the given expression. They are, respectively, 3X²; -4X; 5XY; 3X.

Sign: The sign of a term is whether it is positive or negative. Only the second of these four terms is negative. When we write a positive term on its own we don't bother to write the + sign before it.

Term Type: This refers only to the part of the term that is written in letters. Thus, the first term of this expression is an X-squared term, the second term is an X term, the third an XY term and the fourth and last is an X term.

Coefficients: The coefficient of a term is the number at the front of it. The coefficient tells us how many of each term type there are.

Like Term: When term types are the same they are known as like terms. In this example -4X and 3X are like terms. The phrase "collecting like terms" refers to the process of putting like terms together into a single term. For example, collecting -4X and 3X gives the single term -X (note that the coefficient 1 is not written; a term with no coefficient is always assumed to have a coefficient of 1).
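The idea of collecting like terms can be sketched in a few lines of Python. This is only an illustration: the `(coefficient, term type)` pairs and the function name `collect_like_terms` are our own hypothetical representation of the example expression, not anything Excel provides.

```python
from collections import defaultdict

# The example expression 3X^2 - 4X + 5XY + 3X as (coefficient, term type) pairs.
# This representation is an assumption made for illustration only.
terms = [(3, "X^2"), (-4, "X"), (5, "XY"), (3, "X")]


def collect_like_terms(pairs):
    """Add together the coefficients of terms that share a term type."""
    totals = defaultdict(int)
    for coefficient, term_type in pairs:
        totals[term_type] += coefficient
    return dict(totals)


collected = collect_like_terms(terms)
print(collected)  # the two X terms collapse into one term with coefficient -1
```

The X-squared and XY terms have no like terms, so their coefficients are unchanged; only -4X and 3X are combined.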

Translating an Expression into an Excel Formula


Example: = 3*A1^2 - 4*A1 + 5*A1*B1 + 3*A1

In this example we have substituted cell reference A1 for the letter X, and cell reference B1 for the letter Y.

The only way to tell Excel that an entry in a cell is an expression, and that it is to be passed to its calculation engine for processing, is to prefix the expression with an equals symbol or a unary symbol (plus or minus). The equals symbol is more commonly used and is recommended.

Excel demands that we be much more explicit when describing an expression. For instance, we know from the previous example that the term 3X² means that there are 3 X-squared terms. In Excel we need to write both the multiplication and the exponent explicitly, hence 3*A1^2.
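As a rough cross-check outside Excel, the same expression can be written in Python, where `**` plays the role of Excel's `^`. The function names and the sample values for X and Y below are assumptions chosen purely for illustration.

```python
def expression(x, y):
    """3X^2 - 4X + 5XY + 3X, written term by term like the Excel formula."""
    return 3 * x**2 - 4 * x + 5 * x * y + 3 * x


def collected(x, y):
    """The same expression after collecting the like terms -4X and 3X into -X."""
    return 3 * x**2 - x + 5 * x * y


# x stands in for cell A1 and y for cell B1 (sample values, not from the manual).
a1, b1 = 2, 3
print(expression(a1, b1), collected(a1, b1))
```

Both forms give the same result for any pair of inputs, which is a quick way to confirm that collecting like terms has not changed the expression.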


Statistical Notations and Worksheet Functions


The use of the term "notation" in the following context needs clarification. In statistics, notations might refer to symbols used to represent an instruction on how to process a term. Let us explore two common notations used in statistics.

(Capital) Sigma, Σ

The first and most common symbol in expressions is the Greek letter capital sigma, written as Σ. This is not to be confused with the lower case Greek letter sigma, σ, which is used to measure spread, called the standard deviation. The sigma we refer to, Σ, is an instruction to add a set of numbers together. So ΣX means add together all of the X values. Similarly, ΣXY means add together all of the XY products. For example:

     X     Y
     0    -4
     1     1
     2     1
     3     3
     4     2

ΣX = 0 + 1 + 2 + 3 + 4 = 10
ΣY = -4 + 1 + 1 + 3 + 2 = 3

Table 1-1 Summing the X and Y values separately

To find ΣXY, it is necessary to calculate all five separate products of X times Y and then add them together, thus:

     X     Y    XY
     0    -4     0
     1     1     1
     2     1     2
     3     3     9
     4     2     8

ΣXY = 0 + 1 + 2 + 9 + 8 = 20

Table 1-2 Summing the XY products
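The sigma instructions in Tables 1-1 and 1-2 can be reproduced with a short Python sketch. The variable names are our own; the data are the X and Y values from the tables.

```python
xs = [0, 1, 2, 3, 4]    # the X values
ys = [-4, 1, 1, 3, 2]   # the Y values

sum_x = sum(xs)   # "add together all of the X values"
sum_y = sum(ys)   # "add together all of the Y values"

# Calculate each of the five separate XY products, then add them together.
products = [x * y for x, y in zip(xs, ys)]
sum_xy = sum(products)

print(sum_x, sum_y, products, sum_xy)
```

The intermediate `products` list corresponds to the XY column of Table 1-2.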

SUM and SUMPRODUCT

The notations used in expressions are not available to us in Excel formulae; that is, Excel cannot interpret these symbols or the anatomy of these expressions. Instead, we pass instructions to Excel using Worksheet Functions.


The instruction ΣX, meaning add together all of the X values, is passed to Excel using the SUM worksheet function. For example:

          A     B
     1    X     Y
     2    0    -4
     3    1     1
     4    2     1
     5    3     3
     6    4     2

ΣX = SUM(A2:A6) = 10
ΣY = SUM(B2:B6) = 3

Table 1-3 Summing the X and Y values separately using SUM

The instruction ΣXY, meaning add together all of the XY products, is passed to Excel using the SUMPRODUCT worksheet function, thus:

          A     B     C
     1    X     Y    XY
     2    0    -4     0
     3    1     1     1
     4    2     1     2
     5    3     3     9
     6    4     2     8

ΣXY = SUMPRODUCT(A2:A6,B2:B6) = 20

Table 1-4 Summing the XY products using SUMPRODUCT

Note that we do not need to make any reference to column C.

X bar, X̄

Perhaps the second most common symbol in expressions is the X bar, represented by the symbol X̄. This refers to the mean of the X values. The mean is the most common form of average, where one adds up the X values and divides the total by the count of the X values; thus X̄ can also be represented by the following expression:

X̄ = ΣX / n, where n is the count of the X values

AVERAGE

Again, Excel is not able to interpret the X bar symbol in an expression. Instead we need to pass the instruction to Excel using the AVERAGE worksheet function. Using the preceding examples, the instruction to calculate the mean of the X values can be passed using the following expression:

=AVERAGE(A2:A6)
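For comparison, here is a minimal Python sketch of the same mean, computed once "by hand" (sum divided by count) and once with the standard library's `statistics.mean`. The list `xs` stands in for the range A2:A6; the variable names are our own.

```python
from statistics import mean

xs = [0, 1, 2, 3, 4]   # stands in for the values in A2:A6

# The mean written out: add up the X values and divide by their count.
x_bar_manual = sum(xs) / len(xs)

# The same instruction passed to a library function, much as AVERAGE does in Excel.
x_bar = mean(xs)

print(x_bar_manual, x_bar)
```

Both approaches give a mean of 2 for these values.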


Introduction to Excel Formula


In the previous sections we looked at expressions and how one would translate these expressions into syntax that Excel can interpret. We also introduced a few worksheet functions. Let us now explore the anatomy of a typical Excel formula, with an embedded worksheet function, using the appropriate Excel terminology.

Basic Anatomy of an Excel Formula

A formula can contain any or all of the following: [worksheet] functions, references, operators and constants.

Example: = ROUND(A1+A2,2) = ROUND(TotalSales,2)

Function: ROUND is a function used to round a number to n decimal places, in this example 2.

References: References include cell addresses and names. In the given formula A1, A2 and TotalSales are all examples of references, with the latter being a name.

Operators: There are five categories of operators: arithmetic, comparison, text concatenation, reference and wildcard. In the given formula the + (plus) is an example of an arithmetic operator, and the second = (equals) is an example of a comparison operator.

Constants: A constant is a value that is not calculated. Any value resulting from an expression is not a constant. In the given formula the number 2 is an example of a constant.
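To see how the pieces of the example formula fit together, here is a rough Python sketch of its evaluation. Everything in it is an assumption for illustration: the cell values, the variable names, and the helper `excel_round`. Python's built-in `round` resolves ties to the nearest even digit, so the helper uses the `decimal` module to round halves away from zero, which is our understanding of how ROUND treats ties.

```python
from decimal import Decimal, ROUND_HALF_UP


def excel_round(value, digits):
    """Round to `digits` decimal places, with ties going away from zero."""
    quantum = Decimal(1).scaleb(-digits)   # digits=2 gives a step of 0.01
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    return float(rounded)


# Hypothetical cell contents: two references and one name.
a1, a2 = 100.004, 50.003
total_sales = 150.01

left = excel_round(a1 + a2, 2)        # function, references, arithmetic operator
right = excel_round(total_sales, 2)   # function applied to a name
result = left == right                # comparison operator yields a logical value

print(left, right, result)
```

The constant 2 appears twice, once in each call, and the final comparison produces a single TRUE or FALSE style result for the whole formula.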

Operators
Operators specify the type of calculation that you want to perform on the elements of a formula. There is a default order in which calculations occur, generally following mathematical rules, but that can be changed using parentheses.

ARITHMETIC OPERATOR    MEANING           EXAMPLE
+ (plus)               Addition          =3+3
- (minus)              Subtraction       =5-4
* (asterisk)           Multiplication    =10*10
/ (forward slash)      Division          =10/2
% (percent)            Percent           =50%
^ (caret)              Exponentiation    =2^2

Table 1-5 Arithmetic Operators

Arithmetic operators always yield a numeric data type result.


COMPARISON OPERATOR    MEANING                     EXAMPLE
=                      Equal to                    =A1=B1
>                      Greater than                =A1>B1
<                      Less than                   =A1<B1
>=                     Greater than or equal to    =A1>=B1
<=                     Less than or equal to       =A1<=B1
<>                     Not equal to                =A1<>B1

Table 1-6 Comparison Operators

Comparison operators always yield a logical data type result (i.e. TRUE or FALSE).

TEXT OPERATOR    MEANING                      EXAMPLE
& (ampersand)    Concatenates two operands    =A1 & B1

Table 1-7 Text Operators

The text operator always yields a string data type result, even if the operands are numerical values.

REFERENCE OPERATOR    MEANING                                                                                           EXAMPLE
: (colon)             Range operator, producing a single reference to all cells between and including the two given references.    =A1:B10
, (comma)             Union operator, combining multiple references into a single reference.                            =A1:A10,C1:C10
(space)               Intersection operator, producing a reference to the cells common to two given references.         =B7:D7 C6:C8

Table 1-8 Reference Operators

Reference operators always yield a range data type result, specifically a range object.

WILDCARD OPERATOR    MEANING                                                                          EXAMPLE
* (asterisk)         Matches any number of characters.                                                =COUNTIF(A1,"*XYZ*")
? (question mark)    Matches any single character.                                                    =COUNTIF(A1,"?" & "XYZ")
~ (tilde)            Matches the character that follows it literally (for example a literal * or ?).  =COUNTIF(A1,"~*")

Table 1-9 Wildcard Operators

Wildcard operators are used in text comparison statements.


Calculation Order and Operator Precedence
It probably comes as no surprise to learn that Excel calculates formulae in a very specific order. A formula in Excel always begins with an equal sign (=). Following the equal sign are the elements (operands) to be calculated, such as constants or references. These are separated by calculation operators. Excel calculates the formula from left to right, according to a specific order for each operator in the formula.

RANK    OPERATOR                       DESCRIPTION
1       : (colon) (space) , (comma)    Reference operators
2       -                              Negation (e.g. -1)
3       %                              Percent
4       ^                              Exponentiation
5       * and /                        Multiplication and division
6       + and -                        Addition and subtraction
7       &                              Concatenation
8       = < > <= >= <>                 Comparison

Table 1-10 Operator Precedence

To change the order of calculation, enclose the part of the formula to be calculated first in parentheses.

EXAMPLE       EXPRESSION
=5+5*2        =5+(5*2) =5+10 =15
=(5+5)*2      =10*2 =20

Table 1-11 Using parentheses to change calculation order
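For readers who also script outside Excel, the following is a minimal sketch in Python (the choice of Python is an assumption of this example; the manual itself only uses worksheet formulae). It reproduces the two rows of Table 1-11 and illustrates one consequence of Table 1-10: because Excel ranks negation above exponentiation, =-2^2 yields 4, whereas Python ranks exponentiation first.

```python
# Parentheses change the order of calculation, as in Table 1-11.
without_parentheses = 5 + 5 * 2      # multiplication first: 5 + 10
with_parentheses = (5 + 5) * 2       # addition first: 10 * 2

# Excel ranks negation above exponentiation, so =-2^2 is read as (-2)^2.
excel_style_negation = (-2) ** 2
# Python ranks exponentiation above negation, so -2 ** 2 is read as -(2 ** 2).
python_style_negation = -2 ** 2

print(without_parentheses, with_parentheses)        # 15 20
print(excel_style_negation, python_style_negation)  # 4 -4
```

When in doubt about precedence, explicit parentheses remove the ambiguity in either environment.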

Cell Referencing
Relative References: A relative reference in a formula, such as A1, is based on the relative position of the cell that contains the formula and the cell that the reference refers to. If the cell position of the formula changes then the cell referenced by the formula will change relatively too.

Absolute References: An absolute reference in a formula, such as $A$1, always refers to a cell in a specific location. If the cell position of the formula changes then the cell referenced by the formula will not change. In A1 notation column and row references are flagged as absolute by prefixing the column and row with the $ (dollar) symbol, also referred to as an anchor.

Mixed References: A mixed reference has either an absolute column and a relative row, or an absolute row and a relative column. What this means, essentially, is that only the column or only the row is anchored. $A1 tells us that the column reference, A, will not change when the formula cell changes position. The row reference, however, will change relative to the position. Conversely, A$1 tells us that the row reference, 1, will not change when the formula cell changes position. The column, however, has not been anchored and will change relatively.
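The anchoring rules above can be sketched in Python. This is an illustrative model only, not how Excel is implemented; the function names `col_to_num`, `num_to_col` and `copy_reference` are hypothetical, and only single-cell A1 references are handled.

```python
import re

def col_to_num(letters):
    """Convert column letters to a 1-based number (A -> 1, Z -> 26, AA -> 27)."""
    num = 0
    for ch in letters:
        num = num * 26 + (ord(ch) - ord("A") + 1)
    return num

def num_to_col(num):
    """Convert a 1-based column number back to letters (27 -> AA)."""
    letters = ""
    while num > 0:
        num, rem = divmod(num - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def copy_reference(ref, rows_moved, cols_moved):
    """Return the reference as it reads after the formula cell changes position."""
    col_anchor, col, row_anchor, row = re.fullmatch(
        r"(\$?)([A-Z]+)(\$?)(\d+)", ref).groups()
    if not col_anchor:               # relative column: follows the move
        col = num_to_col(col_to_num(col) + cols_moved)
    if not row_anchor:               # relative row: follows the move
        row = str(int(row) + rows_moved)
    return f"{col_anchor}{col}{row_anchor}{row}"

# Formula moved one row down and one column to the right.
for ref in ["A1", "$A$1", "$A1", "A$1"]:
    print(ref, "->", copy_reference(ref, 1, 1))
```

Run as written, the relative reference becomes B2, the absolute reference stays $A$1, and the two mixed references become $A2 and B$1.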

3-D References
Using multiple sheets in your calculations introduces a 3rd dimension. Use a 3-D reference if you wish to analyse the same cell, or range of cells, on multiple worksheets in a workbook.

Example: =SUM(Sheet1:Sheet5!A1:A10)

In this example the values housed in A1:A10, within all sheets positioned between and including Sheet1 and Sheet5, are summed to yield a result.

Union References
Union references, i.e. cell references separated with the comma (,) separator, allow us to create references to non-contiguous ranges.

     A                 B    C
1    1                 2    3
2    2                 3    4
3    3                 4    5
4    =SUM(A1,B2,C3)

Table 1-12 Aggregating unioned references

Intersecting Ranges
You can aggregate values from an intersection of two range references. In other words, only the cells that fall within both range references are taken into account.

     A               B       C       D
1                    NWE     SWE     NEE
2    Sales           2800    1400    1800
3    COGS            1100    750     950
4    Gross Margin    1700    650     850
5
6    =NWE COGS

Table 1-13 Aggregating intersecting references
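The union and intersection operators can be modelled with a short Python sketch. The data comes from Table 1-13; the assumption that the names NWE and COGS refer to B2:B4 and B3:D3 respectively is ours, since the table does not show the name definitions.

```python
# Worksheet values from Table 1-13, keyed by cell address.
cells = {
    "B2": 2800, "C2": 1400, "D2": 1800,   # Sales
    "B3": 1100, "C3": 750,  "D3": 950,    # COGS
    "B4": 1700, "C4": 650,  "D4": 850,    # Gross Margin
}

# Hypothetical name definitions: NWE is column B2:B4, COGS is row B3:D3.
names = {
    "NWE": {"B2", "B3", "B4"},
    "COGS": {"B3", "C3", "D3"},
}

def total(addresses):
    """Sum the values held in the given cell addresses."""
    return sum(cells[address] for address in addresses)

# Intersection operator (space): only the cells common to both references.
intersection = names["NWE"] & names["COGS"]
intersection_total = total(intersection)

# Union operator (comma): every listed cell is aggregated, e.g. =SUM(B2,C3,D4).
union_total = total(["B2", "C3", "D4"])

print(sorted(intersection), intersection_total)  # ['B3'] 1100
print(union_total)                               # 4400
```

Under these assumptions =NWE COGS resolves to the single cell B3.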


Reference Notation
In Excel, references conform to one of two notations, namely A1 reference style or R1C1 reference style. The former is the default but either is acceptable.

A1 Notation: In A1 reference style columns are represented by letters A:IV (Excel 2003 and earlier versions) or A:XFD (Excel 2007 and subsequent versions). Rows are numbered. In this reference style columns and rows are anchored by prefixing the column or row reference with a $ (dollar symbol).

R1C1 Notation: In R1C1 reference style both columns and rows are numbered. Cell references are displayed in terms of their relationship to the cell that contains the formula rather than their actual position on the grid. Cells are referred to by relative notation. Relative references have numbers in square brackets.

REFERENCE    MEANING
R[-2]C       A relative reference to the cell two rows up and in the same column.
RC[-2]       A relative reference to the cell in the same row and two columns to the left.
R[2]C[2]     A relative reference to the cell two rows down and two columns to the right.
R2C2         An absolute reference to a cell in the 2nd row and 2nd column (i.e. B2).
R[-1]        A relative reference to the entire row above the active cell.
C[-1]        A relative reference to the entire column to the left of the active cell.
R            An absolute reference to the current row.
C            An absolute reference to the current column.
RC           An absolute reference to the active cell.

Table 1-14 R1C1 Notation
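To make the relationship between the two notations concrete, here is a minimal Python sketch that resolves a single-cell R1C1 reference to A1 notation, given the position of the cell holding the formula. The function `r1c1_to_a1` is hypothetical and deliberately limited: whole-row and whole-column forms such as R[-1] are not handled.

```python
import re

def num_to_col(num):
    """Convert a 1-based column number to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while num > 0:
        num, rem = divmod(num - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def r1c1_to_a1(ref, formula_row, formula_col):
    """Resolve a single-cell R1C1 reference relative to the formula cell."""
    match = re.fullmatch(r"R(\[-?\d+\]|\d+)?C(\[-?\d+\]|\d+)?", ref)
    if match is None:
        raise ValueError(f"unsupported reference: {ref}")

    def resolve(part, origin):
        if part is None:                  # bare R or C: same row or column
            return origin, False
        if part.startswith("["):          # [n]: offset from the formula cell
            return origin + int(part[1:-1]), False
        return int(part), True            # plain number: fixed position

    row, row_fixed = resolve(match.group(1), formula_row)
    col, col_fixed = resolve(match.group(2), formula_col)
    return f"{'$' if col_fixed else ''}{num_to_col(col)}{'$' if row_fixed else ''}{row}"

# The formula is assumed to sit in C5 (row 5, column 3).
for ref in ["R[-2]C", "RC[-2]", "R[2]C[2]", "R2C2"]:
    print(ref, "->", r1c1_to_a1(ref, 5, 3))
```

With the formula in C5, the four references resolve to C3, A5, E7 and $B$2 respectively.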

Defined Names
You can create names to represent cells, ranges of cells, formulae, constants, array constants or Excel tables. A name is a meaningful shorthand that makes it easier to understand the purpose of a reference in a formula.

When to use names:
- To represent cells, or ranges of cells, that will be frequently referenced in formulae, pivot tables and charts.
- To house constants that will be frequently referenced in formulae.
- To facilitate dynamic range references to be used in formulae, pivot tables and charts. Dynamic ranges are generated using formulae.

All names have a scope, either to a specific worksheet (referred to as local scope) or to the entire workbook (referred to as global scope). The scope of a name is the location within which the name is recognised without qualification. For example, if you have a name such as Budget_FY11, and its scope is Sheet1, that name is recognised without qualification only in Sheet1. In other sheets it must be qualified (e.g. Sheet1!Budget_FY11).


NAME    REFERS TO      SCOPE
Test    ="Sheet"       Sheet1
Test    ="Workbook"    Workbook

FORMULA          LOCATION    RESULT
=Test            Sheet1      Sheet
=Sheet1!Test     Sheet1      Sheet
=Test            Sheet2      Workbook
=Sheet1!Test     Sheet2      Sheet

Table 1-15 Demonstrating name scope recognition
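The lookup order implied by Table 1-15 can be sketched in Python as follows. The dictionaries and the function `resolve_name` are hypothetical and only model the four cases in the table; falling back to the workbook-level name when no local name exists is an assumption of this sketch.

```python
# Names from Table 1-15: one scoped to Sheet1, one scoped to the workbook.
local_names = {("Sheet1", "Test"): "Sheet"}
global_names = {"Test": "Workbook"}

def resolve_name(name, location, qualifier=None):
    """Return the value of a name as seen from the sheet holding the formula.

    location is the sheet containing the formula; qualifier is the sheet
    prefix of a qualified name such as Sheet1!Test, or None if unqualified.
    """
    sheet = qualifier if qualifier is not None else location
    if (sheet, name) in local_names:      # a local name takes priority
        return local_names[(sheet, name)]
    return global_names[name]             # otherwise use the workbook name

print(resolve_name("Test", "Sheet1"))               # Sheet
print(resolve_name("Test", "Sheet1", "Sheet1"))     # Sheet
print(resolve_name("Test", "Sheet2"))               # Workbook
print(resolve_name("Test", "Sheet2", "Sheet1"))     # Sheet
```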

Array (CSE) formulae
An array formula can perform multiple calculations and then return either a single result or multiple results. Array formulae act on two or more sets of data known as array arguments. One creates array formulae in the same manner in which one produces a normal formula, but the instruction to process the formula as an array formula is given by confirming the formula entry with Control+Shift+Enter. If done properly Excel encapsulates the formula in curly brackets {}. Do not attempt to type in the curly brackets manually. This form of formula is also commonly referred to as a CSE formula because of the need to commit it with Control+Shift+Enter.

The first type of array formula, i.e. the type used to yield a single result, offers us endless possibilities, but unfortunately such formulae are also known to add significant overhead to the calculation process. This is not always true; array formulae have received some undeserved bad publicity, as in some uses they can actually reduce the overhead in the calculation process. Best practice suggests that we use array formulae in moderation and consider adopting a stepped approach as an alternative (i.e. using helper cells, columns and rows). But for the budding formula guru, I suggest experimenting with both array formulae and classic methods using a stepped approach, noting the changes in calculation times, and drawing your own conclusions on when it is acceptable, or not, to use array formulae. Sometimes practicality must prevail over efficiency, provided that the methods used are not grossly inefficient.

When we create a single result array formula we pass it an array of variable values or an array of constant values. The array on its own serves little purpose. Instead we have to pass an instruction to Excel on how to aggregate the array, typically using SUM, AVERAGE or COUNT.

FORMULA                           RESULT    COMMENT
{={1;2;3;4;5;6;7;8;9;10}}         1         If you were to enter this formula in cell A1, and commit with CSE, Excel will yield a result of 1 (the first array item). To aggregate a result one must pass an instruction to Excel telling it what form of aggregation to apply to the items in the array.
{=SUM({1;2;3;4;5;6;7;8;9;10})}    55        Here the result is 55 because Excel has received an instruction to SUM each item in the array.

Table 1-16 Aggregating an Inline Array Constant


FORMULA              RESULT    COMMENT
{=ROW(1:10)}         1         In this example Excel is told to yield an array of values associated with the given row numbers. Again this is rather pointless, unless the array is used for some form of aggregation.
{=SUM(ROW(1:10))}    55        Excel yields a result of 55, the SUM of each item in the array.

Table 1-17 Aggregating an Array
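As a rough analogy in Python (an illustration only, not a description of Excel's internals), an array formula entered into a single cell shows only the first item of the array, while wrapping the array in an aggregate reduces it to one value:

```python
# ROW(1:10) yields the row numbers 1 to 10 as an array.
row_numbers = list(range(1, 11))

# A single cell can only display one item, so the first item is shown.
displayed_without_aggregate = row_numbers[0]

# SUM aggregates every item in the array into a single result.
aggregated = sum(row_numbers)

print(displayed_without_aggregate)  # 1
print(aggregated)                   # 55
```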

The exhibit in Table 1-16 demonstrates the syntax of an array formula with an inline array constant. When passing inline array constants Excel automatically recognises that it should treat the formula as an array formula. Therefore it is not necessary to explicitly pass the instruction to Excel using CSE. Thus:

=SUM({1;2;3;4;5;6;7;8;9;10})

will yield the same result as:

{=SUM({1;2;3;4;5;6;7;8;9;10})}

The exhibit in Table 1-17 demonstrates the syntax of an array formula calling a variable array. This form of array formula does require that we explicitly pass Excel an instruction to treat the formula as an array formula. However, the SUMPRODUCT function aggregates its results using the array formula method, and thus we are not explicitly required to instruct Excel to treat SUMPRODUCT like an array formula. When passing a single array of values to SUMPRODUCT, SUMPRODUCT can only yield a summation of those values. Thus:

{=SUM(ROW(1:10))}

will yield the same result as:

=SUMPRODUCT(ROW(1:10))

The use of SUMPRODUCT in this context is recommended because it avoids someone inadvertently recommitting the formula without CSE. The LOOKUP and FREQUENCY functions are also capable of processing arrays without CSE. An exception to this is when the TRANSPOSE function is used within an array formula argument.

The second type of array formula is the type that yields multiple results. This form is commonly referred to as an array entered formula. A typical example is the TRANSPOSE worksheet function. TRANSPOSE is used to copy an array of values and yield a result of opposite orientation or dimension.

     A    B    C
1    X    Y    Z
2
3    X
4    Y
5    Z

A3:A5 contains {=TRANSPOSE(A1:C1)}

Table 1-18 An Array Entered Formula

In this example one would first select range A3:A5, then type the formula, and then commit with CSE. It is not necessary to anchor any of the references as none will move relatively. Excel knows to handle the range as an array of values. There are two effects of an array entered formula that one needs to be aware of:

1. One cannot change a single element of the array (in this example A3:A5). The array needs to be handled as a single entity, thus if changes are required one needs to select the entire range, enter the revised formula, and commit with CSE.
2. As a result of (1) above, one cannot delete a row or column that intersects an array entered formula range. In the above example one could delete column A because the entire array range is contained within that column. One cannot however delete row 3, 4 or 5 individually because each intersects with the array entered formula range. Deleting all of rows 3:5 (in one hit) is permissible for the same reason that one can delete column A.

Array Constants
Array constants, which were briefly mentioned in the section above, are merely arrays that remain constant. Array constants can contain text, numbers, logical values or error values. Numbers, logical values and errors can be typed in as is. Text values must be enclosed in speech marks. When you enter array constants make sure you:

1. Enclose them in curly brackets {}.
2. Denote column partitions with a comma (,).
3. Denote row partitions with a semi-colon (;).

Example: {1,2;3,4}

This example demonstrates an array comprising two rows and two columns. Array constants can be entered in names or directly within a formula. When entered directly into a formula they are referred to as inline array constants. Inline arrays and named arrays need to be treated as two separate animals:

NAMED ARRAY (not CSE entered)    NAMED ARRAY (CSE entered)    INLINE ARRAY CONSTANT
=SUM(myarray) =8                 {=SUM(myarray)} =8           =SUM({3;5}) =8
=SUM(myarray)+1 =9               {=SUM(myarray)+1} =9         =SUM({3;5})+1 =9
=SUM(myarray+1) =4               {=SUM(myarray+1)} =10        =SUM({3;5}+1) =10
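The delimiter rules for array constants can be sketched with a small Python parser. The function `parse_array_constant` is hypothetical and handles numeric items only (no text, logical or error values). The second half of the sketch assumes that myarray refers to the constant {3;5}, which is consistent with the results shown above but is not stated explicitly.

```python
def parse_array_constant(text):
    """Parse a numeric array constant such as {1,2;3,4} into a list of rows."""
    body = text.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("array constants must be enclosed in curly brackets")
    rows = body[1:-1].split(";")          # semi-colons separate rows
    # Commas separate the columns within each row.
    return [[float(item) for item in row.split(",")] for row in rows]

two_by_two = parse_array_constant("{1,2;3,4}")
print(two_by_two)                          # [[1.0, 2.0], [3.0, 4.0]]

# Assumed contents of myarray: one column, two rows.
myarray = parse_array_constant("{3;5}")
items = [item for row in myarray for item in row]

sum_then_add = sum(items) + 1                    # like =SUM({3;5})+1
add_then_sum = sum(item + 1 for item in items)   # like =SUM({3;5}+1)
print(sum_then_add, add_then_sum)                # 9.0 10.0
```

The difference between the last two results is the reason the position of the +1, inside or outside the aggregate, matters when the formula is processed as an array.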

One is not required to CSE commit an array formula with an inline array constant; it is a given. But one must be cautious when referring to named arrays because the behaviour does not appear to be consistent. On first review it appears as though it is not necessary to CSE commit formulae with named array references. However, look at the 3rd exhibit under NAMED ARRAY (not CSE entered). This rendition does need to be CSE committed. Of course in this example the entire issue can be overcome by using SUMPRODUCT, but that is not the point. The same issue would apply using other aggregate functions, such as AVERAGE. The recommendation here is: when in doubt, use CSE to commit the formula.


2. How the Excel Recalculation Engine Works


Excel uses a complex algorithm for choosing the fastest route and the minimum number of cells required to calculate a formula result. Excel's recalculation engine normally optimises calculation time by tracking changes and only recalculating:
- Cells, formulae, values or names that have changed since the last calculation.
- Cells dependent on other cells, formulae, names or values that need recalculation.

The exceptions to the statements above are:
- Volatile functions are always calculated.
- Full calculation (Control+Alt+F9) will force calculation of all formulae.
- Having more than 65536 dependencies causes full calculation to be invoked.
- Names that are not called anywhere in a worksheet are never calculated.
- Names are calculated each time they are referenced by a formula that is recalculated.

Dependency Trees
Excel tracks changes since the last recalculation and builds dependency trees in an attempt to reduce calculation time. These prompt Excel to recalculate only:
- Formulae that have changed.
- Names that have changed.
- Volatile functions.
- Formulae dependent on changed or volatile formulae, names or cells.

Dependency trees are immediately updated whenever a formula is entered or changed. In Excel 2002 and later you can force Excel to rebuild the dependency trees by hitting Control+Alt+Shift+F9. In complex formula-based models, Excel may spend considerable time and memory building and evaluating the dependency trees. In versions prior to Excel 2007 dependency trees will only store up to 65536 dependencies to unique references. Where complex formula-based models near that limit it is not unusual to find full calculation faster than recalculation.
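The idea behind dependency tracking can be illustrated with a simplified Python sketch. This is a conceptual model under our own assumptions, not Excel's actual algorithm: `dependents` is a hypothetical mapping from each cell to the cells whose formulae refer to it directly, and the function walks that mapping to find everything that needs recalculating after a change.

```python
from collections import deque

def cells_to_recalculate(dependents, changed):
    """Return every cell flagged for recalculation after `changed` cells change."""
    dirty = set(changed)
    queue = deque(changed)
    while queue:
        cell = queue.popleft()
        for dependent in dependents.get(cell, ()):
            if dependent not in dirty:    # visit each cell only once
                dirty.add(dependent)
                queue.append(dependent)
    return dirty

# B1 refers to A1, C1 refers to B1, and B2 refers to A2.
dependents = {"A1": ["B1"], "B1": ["C1"], "A2": ["B2"]}

print(sorted(cells_to_recalculate(dependents, ["A1"])))  # ['A1', 'B1', 'C1']
```

Changing A1 leaves A2 and B2 untouched, which is the saving that recalculation offers over full calculation.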


How do you know when you are exceeding the dependency tree limit?
- The word "Calculate" persists in the status bar despite invoking recalculation. Note, "Calculate" will also display in the status bar when:
  o Calculation option has been set to manual and the workbook contains uncalculated formulae.
  o The iteration option is turned on and the workbook contains circular references.
  o You are using Excel 2007 or later and have set Workbook ForceFullCalculation to True.
- Changing a cell and tabbing to another cell takes a long time.

Dependency trees are categorised as follows:
- Within Sheet Dependency Trees
- Inter Sheet Dependency Trees
- Inter Workbook Dependency Trees

Formulae with references to other sheets are known to take longer to calculate. Formulae with references to other workbooks are also known to take longer to calculate, sometimes quite significantly. One should always consider carefully whether or not to link to other workbooks, and perhaps favour storing the external data directly within the same workbook (e.g. by using a query table).

Volatile Functions
A volatile function is a worksheet function that Excel has determined must be recalculated at each recalculation, regardless of whether or not any of its precedents have changed. A function is not always strictly volatile or non-volatile. Some functions behave in a volatile manner depending on the manner in which they are used. There are however a number of functions that are strictly volatile, namely:

FUNCTION    COMMENT
RAND        Generates a new random number each time recalculation is invoked.
NOW         Returns the current date and time (from the system date and time) each time recalculation is invoked.
TODAY       Returns the current date (from the system date) each time recalculation is invoked.
OFFSET      Returns a reference offset from a given reference.
CELL        Returns information about the formatting, location, or contents of a cell.
INDIRECT    Returns a reference indicated by a text value.
INFO        Returns information about the current operating environment.

Table 2-1 List of Strictly Volatile Functions

The SUMIF function can also behave in a volatile manner depending on the manner in which it is used.

VOLATILE:        =SUMIF(A1:A10,"<>0",B1)
NON-VOLATILE:    =SUMIF(A1:A10,"<>0",B1:B10)


The differences between the two formulae referred to might not be so obvious. The volatile method does not explicitly reference the full column B range, whilst the non-volatile method does.

Direct dependents of volatile functions are always recalculated. Indirect dependents of volatile functions are not always recalculated.

So when is it OK to call volatile functions? The basic rule is to avoid using volatile functions wherever possible. Use volatile functions:
- In moderation. Using a couple of formulae that call volatile functions is not going to slow your calculation time considerably.
- When there is no alternative, or the alternative will add significant overhead to the calculation.

Events that Trigger Recalculation


For the most part calculation is invoked when you change the value in a cell that has a dependent (assuming you are working in automatic calculation mode), or when you hit F9. There are however a number of other triggers that you need to be aware of. The following table lists some of these triggers.

TRIGGER: Autofilter
COMMENT: Selecting any filter criteria will flag all of the formulae in the autofilter range as uncalculated.

TRIGGER: Clicking row or column divider
COMMENT: Clicking a row or column divider will trigger recalculation. Manually changing the span of a row or column however will not trigger a recalculation.

TRIGGER: Inserting or deleting rows, columns or cells
COMMENT: Any formulae that refer to other worksheets, and any formulae containing names that refer to other worksheets or to the current worksheet, will become flagged as uncalculated. Any formulae that are referred to by formulae in other worksheets will also become flagged as uncalculated.

TRIGGER: Renaming, deleting and moving worksheets
COMMENT: Renaming worksheets, deleting worksheets and changing the position of a worksheet in a workbook will trigger recalculation.

Table 2-2 Recalculation Event Triggers

Calculation Methods
Normally Excel invokes recalculation when you change a cell value that has dependents. The calculation method used in this case is recalculation.

Shortcuts for invoking calculation:
- Full Calculation: Control+Alt+F9
- Recalculation: F9
- Selected Sheet(s) Only: Shift+F9

Calculating an individual formula, array formula, or part thereof: Select the formula in the formula bar, or only the portion you want to evaluate, and hit F9. The formula or part of the formula is replaced by the result. For an array formula you will see an array of the results, which is a great way of debugging an array formula.

3. Data Types, Interpretation and Precision


Whenever you type something into a cell, Excel needs to interpret that value so that (1) it knows how to process the value when it is called in a formula, and (2) so that it knows how much memory to allocate for the storage of that value. Data types not only apply to values typed into cells; any value yielded by a formula will be of a certain data type, even the values in names will be of a certain data type.

Data Types
There are a variety of different data types but we are going to group all of the various types into four categories: numbers, text, booleans (also referred to as logicals) and errors. Data types define how the bytes of memory are used to hold the data, and what kind of data can be stored. Generally Excel determines the data type of a value, but we are given a certain amount of control over this. For instance, if you type 12345 into a cell, Excel knows to treat this as a numeric value and thus assigns it a number data type. However, if the cell is formatted as text, or you prefix the entry with an apostrophe, Excel will treat it as a text data type.

Numbers
When a number is held in Excel it is stored in eight bytes. It is the data type that also tells us that the number range at our disposal is finite. In addition to numbers that are obviously number data types, date and time values, although often represented textually, are also numbers. Unless specifically formatted otherwise, all number values will appear right aligned in a cell. It is suggested that you do not change the alignment of numbers in cells because it is a very good visual guide informing you whether a value is recognised as a number or as a text value.

Booleans
A logical or boolean expression is one that evaluates to TRUE or FALSE. You can also manually type boolean values directly into a cell, name or formula argument. Unless specifically formatted otherwise, all boolean values will appear centre aligned in a cell, and appear in uppercase.

Errors
Error values inform us when something has gone wrong! Although they are typically the result of a formula, we can actually type error values manually into a cell. Unless specifically formatted otherwise, all error values will appear centre aligned, appear in uppercase and be prefixed with the hash symbol (#).


ERROR      MEANS
#N/A       Excel cannot find a lookup value within a specified lookup table. It is likely that: the lookup value does not exist within the lookup table; the data type of the lookup value is not consistent with the entry in the lookup table; or your lookup value does not match the value in the lookup table (check for leading and trailing spaces).
#VALUE!    Occurs when the wrong type of argument or operand is used. The error is most commonly yielded when attempting an arithmetical calculation using a text value.
#NAME?     A function or name is not recognised. Usually the result of a typo.
#DIV/0!    Result of an attempt to divide a number by zero.
#NULL!     Occurs when you specify an intersecting range which in fact does not intersect.
#REF!      Result of an invalid reference in your formula. Occurs usually when you delete the physical reference, meaning that the reference in the formula has nothing to point to.

Table 3-1 List of error types

Text
Generally a catchall for all other values not identified as belonging to one of the already mentioned data types. Unless specifically formatted otherwise, all text values will appear left aligned. Text values are actually ordered values, in that a text value can be equal to, less than or greater than another text value. For instance, the comparative expression ="A">"Z" will yield FALSE, whereas ="Z">"A" will yield TRUE.

Floating-Point Precision
Excel was designed in accordance with the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This standard defines how floating-point numbers are stored and calculated. The advantage of using floating-point representation over fixed-point representation is that it can support a wider range of values. For example, a fixed-point representation that has seven decimal digits with two decimal places can represent the numbers 12345.67, 123.45, 1.23 and so on. Floating-point representation with seven decimal digits, however, can in addition represent 1.234567, 123456.7, 0.00001234567, 1234567000000000 and so on. The number of digits of precision limits the accuracy of numbers. For example, the number 1234567890123456 cannot be exactly represented if 15 digits of precision are used. Excel uses 15 digits of precision.

Loss of Precision When Using Very Large Numbers

     A
1    1.2E+200
2    1E+100
3    =SUM(A1:A2)
4    =1.2E+200

Table 3-2 Example loss of precision when using very large numbers

The resulting value in A3 is 1.2E+200, the same number as in A1. At least 100 digits of precision would be required to accurately compute the result.


Loss of Precision When Using Very Small Numbers

     A
1    0.000123456789012345
2    1
3    =SUM(A1:A2)
4    =1.00012345678901

Table 3-3 Example loss of precision when using very small numbers

The resulting value in A3 is 1.00012345678901 instead of 1.000123456789012345. At least 19 digits of precision would be required to accurately compute the result.
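Both effects can be reproduced in any environment that uses IEEE double-precision numbers. The Python sketch below is one such illustration (Python is our choice here, and the helper `as_displayed` is hypothetical): it adds the very large pair of numbers, and then formats the small-number sum to 15 significant digits.

```python
# Very large numbers: the smaller operand is lost entirely in the addition.
large_sum = 1.2e200 + 1e100

# Very small numbers: digits beyond the available precision are dropped.
small_sum = 0.000123456789012345 + 1

def as_displayed(value, digits=15):
    """Format a number to the given count of significant digits."""
    return format(value, f".{digits}g")

print(large_sum == 1.2e200)       # True
print(as_displayed(small_sum))    # 1.00012345678901
```

The first line prints True because adding 1E+100 changes nothing that a double can store at the magnitude of 1.2E+200.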

Boolean Logic
Many users are already aware that boolean values can be represented with digital values. In Excel we can pass numerical values to logical function arguments and we can pass boolean values in expressions to be computed as digital values. The process Excel undergoes to convert boolean values to digital values, and vice versa, is referred to as coercion.

Coercion
In Excel, we can use numerical values in formulae to represent boolean values. Excel will recognise zero as FALSE and any non-zero number as TRUE. There is no explicit instruction needed to tell Excel to coerce zero to FALSE and a non-zero number to TRUE; it is a given when any such number is passed to a logical argument. Coercing a boolean to a digital value will represent FALSE as zero and TRUE as one. We do however need to be explicit when coercing a boolean to a digital value. A boolean is coerced to a digital value when it is used as an operand in an arithmetical expression. To yield the representative digital value we use an expression that will not change the numeric value of the digital value equivalent.

     A                B             C         D    E                F             G
1    DIGITAL VALUE    EXPRESSION    RESULT         BOOLEAN VALUE    EXPRESSION    RESULT
2    1                =--A2         1              TRUE             =--E2         1
3    0                =A3+0         0              FALSE            =E3+0         0
4    1                =A4-0         1              TRUE             =E4-0         1
5    0                =A5*1         0              FALSE            =E5*1         0
6    1                =A6/1         1              TRUE             =E6/1         1
7    0                =A7^1         0              FALSE            =E7^1         0

Table 3-4 Coercing boolean values to digital values

It is widely believed that using double negation (--) is the most optimised coercion method, because negation appears early in the order of evaluation. This method can also be used to coerce an entire array of values. For instance, assume you have a comparative expression over an array of values, as the next table illustrates.


     A         B                             C
1    VALUES    EXPRESSION                    RESULT
2    A         =SUMPRODUCT(--(A2:A7="a"))    2
3    B
4    C
5    A
6    B
7    C

Step 0.1             =SUMPRODUCT(--({"a";"b";"c";"a";"b";"c"}="a"))
Step 1.0             =SUMPRODUCT(--({TRUE;FALSE;FALSE;TRUE;FALSE;FALSE}))
Step 1.1             =SUMPRODUCT(-({-1;0;0;-1;0;0}))
Step 2.0             =SUMPRODUCT({1;0;0;1;0;0})
Step 3.0 (RESULT)    =2

Table 3-5 Coercing an array of boolean values to an array of digital values
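The evaluation steps in Table 3-5 can be mirrored in Python, where booleans are likewise coerced to 1 and 0 when used in arithmetic. This is an analogy rather than a description of Excel's engine. Excel's = comparison ignores case, so the sketch lower-cases each value before comparing.

```python
values = ["A", "B", "C", "A", "B", "C"]

# Step 1: the comparison yields an array of booleans.
matches = [value.lower() == "a" for value in values]

# Step 2: double negation coerces each boolean to a digital value.
# The first negation gives -1 or 0, the second gives 1 or 0.
digits = [-(-match) for match in matches]

# Step 3: the aggregate is the count of matching items.
result = sum(digits)

print(matches)  # [True, False, False, True, False, False]
print(digits)   # [1, 0, 0, 1, 0, 0]
print(result)   # 2
```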

AND Logic
AND logic yields TRUE when all comparative statements evaluate to TRUE. If any comparison evaluates to FALSE then AND logic dictates that the result must be FALSE. Multiplying comparative results with each other also serves as AND logic.

BOOLEAN VALUES                              DIGITAL VALUES
CONDITION A    CONDITION B    A AND B       CONDITION A    CONDITION B    A x B
FALSE          FALSE          FALSE         0              0              0
FALSE          TRUE           FALSE         0              1              0
TRUE           FALSE          FALSE         1              0              0
TRUE           TRUE           TRUE          1              1              1

Table 3-6 AND Logic Truth Table

OR Logic
OR logic yields TRUE when any one comparative statement of many yields TRUE. Adding comparative results with each other also serves as OR logic.

BOOLEAN VALUES                              DIGITAL VALUES
CONDITION A    CONDITION B    A OR B        CONDITION A    CONDITION B    A + B
FALSE          FALSE          FALSE         0              0              0
FALSE          TRUE           TRUE          0              1              1
TRUE           FALSE          TRUE          1              0              1
TRUE           TRUE           TRUE          1              1              2

Table 3-7 OR Logic Truth Table
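Both truth tables can be regenerated with a few lines of Python, shown here as an illustrative check. The function names `and_logic` and `or_logic` are hypothetical. Note that the sum in the last row of the OR table is 2, which still counts as TRUE because any non-zero number is treated as TRUE.

```python
from itertools import product

def and_logic(a, b):
    """AND via multiplication: the product is non-zero only if both are TRUE."""
    return bool(a * b)

def or_logic(a, b):
    """OR via addition: the sum is non-zero if at least one is TRUE."""
    return bool(a + b)

# Print one row per combination: A, B, A x B, A + B, A AND B, A OR B.
for a, b in product([False, True], repeat=2):
    print(a, b, a * b, a + b, and_logic(a, b), or_logic(a, b))
```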


Date and Time Values


Excel stores dates as a number representing the number of days since 0 January 1900, and times as a fraction of a 24 hour day. These are referred to as serial dates and times. It is cell formatting that provides the textual representation, but essentially dates are whole numbers and times are decimal values. Knowing that dates are numeric values allows us to handle date and time values constructively in formulae.

For instance, the 4th of April 2010 has a numeric value of 40272, because 40272 days are counted as having elapsed since 0 January 1900. This result is actually overstated by one day because Excel interprets the year 1900 as a leap year (29 days in February), which it was not. For this reason, Excel allows us to switch to a different base, the 1904 date system. Here dates are counted from 1 January 1904. Whilst this system is theoretically more accurate, it is best to avoid using it. The 1900 date system allows greater compatibility with other systems.

The time value 18:42 has a numeric value of 0.779166666666667. This can be validated using the following equation:

$\frac{18 + \frac{42}{60}}{24} \approx 0.779166666666667$
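The serial values quoted above can be checked with a short Python sketch. One assumption needs stating: because Excel counts a 29 February 1900 that never existed, serial numbers from 1 March 1900 onwards correspond to days elapsed since 30 December 1899, so that date is used as the epoch here. The function names are hypothetical.

```python
from datetime import date, timedelta

# Assumed epoch, valid for serial numbers from 1 March 1900 (serial 61) onwards.
EPOCH = date(1899, 12, 30)

def date_to_serial(day):
    """Convert a date to its 1900 date system serial number."""
    return (day - EPOCH).days

def serial_to_date(serial):
    """Convert a 1900 date system serial number back to a date."""
    return EPOCH + timedelta(days=serial)

def time_to_fraction(hours, minutes):
    """Express a time of day as a fraction of a 24 hour day."""
    return (hours * 60 + minutes) / (24 * 60)

print(date_to_serial(date(2010, 4, 4)))   # 40272
print(serial_to_date(40272))              # 2010-04-04
print(round(time_to_fraction(18, 42), 15))
```

A combined date and time is simply the sum of the two parts, for example 40272 plus the fraction for 18:42.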


4. Introducing Worksheet Functions


Worksheet functions allow us to pass instructions to Excel on how to evaluate terms, and as such, a strict convention applies.

1. Firstly Excel needs to determine whether or not an entry into a range, or name, is an expression. This is assumed to be true:
   1.1. When the entry / expression is prefixed with an equals symbol (=) or a unary symbol such as plus (+) or minus (-); AND
   1.2. If entered into a range, the range is not text formatted.
2. Excel splits the expression into the individual terms. It then analyses each term for a worksheet function by cross-referencing each whole word in the term against its function library. Note that it does not assess words enclosed in speech marks.
3. Most worksheet functions take arguments (parameters, or inputs if you like). These arguments are contained within parentheses. Therefore Excel always expects a worksheet function name to be followed by parentheses. If the worksheet function takes arguments then these inputs must be contained within the parentheses. The parentheses must still be present even if the worksheet function does not take any arguments. If the parentheses are missing Excel will assume the component to be a name.
4. Where a worksheet function takes more than one argument (within the parentheses), the arguments must be separated by a comma delimiter. Note that the actual delimiter depends on regional settings; it is common to find arguments delimited by semi-colons on the European continent.

Example: =SUMPRODUCT(A2:A6,B2:B6)

COMPONENT     DESCRIPTION
=             Excel knows to send this expression to the calculation engine because it is prefixed with an equals symbol.
SUMPRODUCT    Excel recognises this worksheet function because it appears in the function library.
(             Opening parenthesis. The arguments are contained within parentheses.
A2:A6         The first argument; namely the X values.
,             A comma separates the arguments.
B2:B6         The second argument; namely the Y values.
)             Closing parenthesis.

Table 4-1 Basic anatomy of a worksheet function

Data Type Conformity


All worksheet functions are configured to yield a result conforming to a certain data type. Those that don't are said to yield a variant data type. Similarly, the values passed to the function arguments are also expected to conform to a predefined data type.


Taking this further, it comes as no surprise that the data type yielded by the SUM function is a number data type. It will also come as no surprise that the data types that SUM expects within its arguments should also be a number data type. But now bear in mind that certain worksheet functions are capable of processing arrays. An array, simply put, is a series of values. For example:

     A
1    X
2    0
3    1
4    2
5    3
6    4
7    =SUM(A2:A6)    (result: 10)

Here the instruction to Excel is to sum each value within the range A2:A6. In reality all that is happening in the background is that Excel is using this range to load values into an array. We can actually pass an array directly to the SUM function argument. For example:

=SUM({1;2;3;4;5;6})

Here an array is recognised because the values are entered within curly braces; specifically, this is an inline array constant.

We have already mentioned that an Excel worksheet function expects its arguments to conform to a predefined data type, and that the SUM function expects us to pass numerical values. So, when passing an array we should try to ensure that each array item (i.e. each value) conforms to the expected data type. The same rule applies to values contained within a range, where that range is passed to the function argument.

In actual fact, the SUM function is very forgiving. If we include a text value within the array that it evaluates, SUM merely treats the text value as zero. This gives SUM a distinct advantage over a classic addition expression.

     A                              B
1    X
2    0
3    1
4    Y
5    3
6    4
7    =SUM(A2:A6)   (result: 8)      =A2+A3+A4+A5+A6   (result: #VALUE!)

Table 4-2 Demonstrating the distinct advantage of using SUM over a classic addition expression
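The contrast in Table 4-2 can be imitated in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it simplifies Excel's behaviour by treating every text item the same way (it does not attempt Excel's coercion of text that looks like a number).

```python
def excel_like_sum(values):
    """Add the numeric items and skip text, as SUM does for a range."""
    return sum(v for v in values
               if isinstance(v, (int, float)) and not isinstance(v, bool))

def excel_like_addition(values):
    """Chain the + operator; a text operand turns the result into an error."""
    total = 0
    for v in values:
        if isinstance(v, str):
            return "#VALUE!"
        total += v
    return total

column_a = [0, 1, "Y", 3, 4]            # A2:A6 from Table 4-2
print(excel_like_sum(column_a))
print(excel_like_addition(column_a))
```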

The formula entered in B7 in Table 4-2 yields an error result. The #VALUE! error in this instance indicates the presence of a non-numerical value. Excel cannot add the text value in A4 to the sum of A2 and A3, hence each evaluation step beyond this point yields an error value.

So far we have only briefly touched on the SUM and SUMPRODUCT functions. Currently, in Excel 2010, there are 331 common worksheet functions. This does not take into account additional worksheet functions at your disposal through add-ins and other external sources. To explore argument data type conformity we need to choose a different function. Let us explore a common favourite, VLOOKUP:


     A      B
1    X      Y
2    A      10
3    B      100
4    C      1000
5
6    A      D

Exact match, lookup value in A6:          =VLOOKUP(A6,A2:B4,2,FALSE)    (result: 10)
Approximate match, lookup value in B6:    =VLOOKUP(B6,A2:B4,2,TRUE)     (result: 1000)

Table 4-3 VLOOKUP, exact match and approximate match syntax

We won't explore the VLOOKUP function in much depth now; that comes later. What is demonstrated here is data type conformity in the function arguments.

The first argument expects a lookup value, i.e. the value sought in the table. In this example we are looking for the value "A" in the lookup table. In the case of VLOOKUP, the lookup value can be a numeric, text or logical value (essentially a variant data type). It would be rather futile to pass an array or inline array constant to this first argument because VLOOKUP expects a single value, and only the first array item will be taken into account.

The second argument represents the table that the lookup value is sought within, and that the return value is contained within. VLOOKUP searches for the lookup value (i.e. the 1st argument) within the left-most column of the table. In our example the table is contained within a range, but it need not be. We could represent the table using an inline array constant, for instance:

=VLOOKUP("a",{"a",10;"b",100;"c",1000},2,FALSE)

Notice that the inline array constant contains both comma and semi-colon separators. The comma represents a column partition and the semi-colon represents a row partition. So we can conclude that this inline array contains two columns and three rows, just as range A2:B4 is made up of two columns and three rows. Again this argument can take a variant data type; however, VLOOKUP will always yield an error unless this argument is either a range or an array.

The third argument indicates which column index to yield a value from, assuming the lookup value is found in the left-most column of the table. In this example the 2 refers to column B, the second column of the table. This argument can only accept an integer value. Excel wouldn't know how to interpret a text string.

VLOOKUP's fourth and final argument is used to instruct Excel whether it should seek an exact match or an approximate match. This can only ever be TRUE or FALSE, in other words a boolean value.

So what happens if we pass anything other than a boolean? If you enter a text value you can expect to receive a #VALUE! error. Excel doesn't have a mechanism for coercing a text value to a boolean value. You can however pass a numeric value. It is not uncommon to see this argument expressed as 1 or 0 (zero). Excel will resolve the number to a boolean, meaning that the strict data type is still applied. The number zero can be used to represent FALSE, and any non-zero number can be used to represent TRUE.


Nested Worksheet Functions


Next we address the topic of nested functions. Although worksheet function arguments need to conform to specific data types, this does not mean that we are restricted only to constant inputs or reference inputs. It is perfectly acceptable to nest a worksheet function, or any formula, within a function argument, provided the result of that nested function conforms to the expected data type. Let us explore this in a little more depth:

      A                   B           C           D
1                         Period 1    Period 2    Period 3
2     Susan               0           0           350
3     Bob                 0           252         125
4     Mary                600         600         600
5     James               125         0           250
6
7     Employee:           Bob
8     Period:             Period 2
9     Expense Claimed:    =INDEX(B2:D5, MATCH(B7,A2:A5,0), MATCH(B8,B1:D1,0))    (result: 252)

Table 4-4 Demonstrating nested worksheet functions within a formula

The INDEX worksheet function takes 3 arguments. We pass a table or array to the first argument. In this example the table is in a range, specifically B2:D5, the expense values only. The second argument tells Excel which Y coordinate, or row index, we want to return a value from. The third argument tells Excel which X coordinate, or column index, we want to return a value from. The intersection of the Y and X coordinates is the result of the INDEX formula. In the table above, we use the MATCH worksheet function to yield the Y coordinate, i.e. the position of "Bob" in A2:A5. We use MATCH again to yield the X coordinate, i.e. the position of "Period 2" in B1:D1. The data type of the MATCH result can only be an integer or an error type (i.e. if no match is found then MATCH will yield #N/A).
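To make the nesting explicit, here is a rough Python analogue of the formula in Table 4-4. The helper names `match_exact` and `index` are hypothetical stand-ins for MATCH (exact match) and INDEX; unlike Excel's MATCH, the sketch is case-sensitive and raises an exception rather than returning #N/A.

```python
def match_exact(lookup_value, lookup_list):
    """1-based position of the first exact match, like MATCH(..., 0)."""
    return lookup_list.index(lookup_value) + 1   # ValueError where Excel gives #N/A

def index(table, row_num, col_num):
    """Value at the 1-based row/column intersection, like INDEX(array, row, col)."""
    return table[row_num - 1][col_num - 1]

employees = ["Susan", "Bob", "Mary", "James"]       # A2:A5
periods = ["Period 1", "Period 2", "Period 3"]      # B1:D1
expenses = [[0, 0, 350],                            # B2:D5
            [0, 252, 125],
            [600, 600, 600],
            [125, 0, 250]]

# The inner calls are evaluated first; their integer results feed the outer call.
claimed = index(expenses,
                match_exact("Bob", employees),
                match_exact("Period 2", periods))
print(claimed)
```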

Optional Arguments
As previously stated, not all worksheet functions take arguments. The TODAY worksheet function is a classic example. TODAY will always yield today's date by collecting the result from the system date.

Of the functions that do rely on arguments, occasionally some of these arguments are optional. An example of this can be observed with the VLOOKUP we explored earlier. The last argument, indicating whether an exact or approximate match is required, is optional. Where this is the case, and the argument is omitted in the formula, Excel will assume a default value. For example:

=VLOOKUP("d",{"a",10;"b",100;"c",1000},2)

In this context Excel will assume that the omitted argument is TRUE; Excel is instructed to perform an approximate match. However:

=VLOOKUP("d",{"a",10;"b",100;"c",1000},2,)

In this context Excel will assume that the argument is FALSE; Excel is instructed not to perform an approximate match. It might not be obvious, but the only difference between the former and the latter is that the latter contains a comma after the column index argument, meaning that the fourth argument has not actually been omitted but that Excel has not been explicitly told what the argument value is.

It is considered best practice to be as explicit as possible when constructing your formula. Being explicit does not add any overhead to Excel's calculations since omitted arguments will always revert to a default value. In fact, it has been suggested by some that being explicit reduces the overhead since Excel does not have to reference its library to establish the default value. Whether or not this is true, the effects are so slight that they are difficult to substantiate.

Logical and Information Functions


Logical functions introduce decision making in Excel. They either yield TRUE or FALSE, or instruct Excel on how to arrive at a result if a condition is either TRUE or FALSE. Information functions answer specific questions and are usually prefixed with IS. In the context of this lesson we will only explore information functions that yield a TRUE or FALSE result.

AND()
Returns TRUE if all of its arguments are TRUE, otherwise yields FALSE. AND supports up to 30 logical arguments in Excel 2003 and earlier, and up to 255 in later versions.
Syntax: AND(logical1, logical2, ...)

OR()
Returns TRUE if any of its arguments are TRUE, and returns FALSE if all of its arguments are FALSE. OR supports up to 30 logical arguments in Excel 2003 and earlier, and up to 255 in later versions.
Syntax: OR(logical1, logical2, ...)

Use arrays when analysing a single cell:
When using OR to test only one cell value, an inline array constant can offer a touch of micro-optimisation. For instance:

     A                            B
1    Bob
2
3    =OR(A1="Mary",A1="Bob")      This rendition involves three evaluation steps.
4    =OR(A1={"Mary";"Bob"})       This rendition involves only two evaluation steps.

NOT()
Reverses the logic of its argument. Use NOT when you want to make sure a value is not equal to one particular value.
Syntax: NOT(logical)


ISBLANK()
Returns TRUE if the value is blank, otherwise returns FALSE. This function can mislead users. ISBLANK will yield FALSE when a value contains a null string, such as a formula configured to yield "". You can also use the LEN function to determine if a value is empty or contains a null string.
Syntax: ISBLANK(value)

ISNA()
Returns TRUE if a value is a #N/A error, otherwise returns FALSE. Use ISERROR() to test if a value is of any error type.
Syntax: ISNA(value)

IF()
Specifies a logical test to perform. Instructs Excel to yield a specific value if the 1st argument is TRUE, or another value if the 1st argument is FALSE.
Syntax: IF(logical_test, value_if_true, value_if_false)

When using IF, do not test if a comparative statement is TRUE or FALSE:
How often do you see IF((value1 > value2)=TRUE, do_this, do_that)? The statement something > something_else is a comparison statement and can only yield TRUE or FALSE. Thus asking Excel to confirm that it is TRUE is an extra and entirely unnecessary evaluation step.

When using IF, do not explicitly ask Excel whether a value is zero or not:
Because Excel recognises zero as FALSE, and any non-zero numeric value as TRUE, it is entirely unnecessary to pass this sort of comparison statement to IF. For instance, IF(value<>0, do_this, do_that) can simply be expressed as IF(value, do_this, do_that), saving an evaluation step.

Avoid IF in logical numerical tests:
The table below illustrates using boolean logic to avoid function calls and reduce the evaluation steps needed to yield a result.

     A                B          C       D
1    Pay 20% bonus on Revenue over 20K only where GP >= 30%
2
3    Profit Centre    Revenue    GP%     Bonus
4    001589           26000      44%     1200
5    001523           19100      28%     0
6    001596           22000      28%     0
7    001508           11200      86%     0

Table 4-5 Boolean logic, multiplying logical tests to avoid function calls and evaluation steps.

The bonus in D4 can be calculated using a combination of IF() and AND(): IF(AND(B4>20000,C4>=0.3),(B4-20000)*0.2,0). This formula involves two function calls and 6 evaluation steps. The same result can be achieved using (B4>20000)*(B4-20000)*(C4>=0.3)*0.2, however this method involves no function calls with the same number of evaluation steps.
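The equivalence of the two bonus formulas can be checked with a short Python sketch using the rows of Table 4-5. Python, like Excel, treats TRUE and FALSE as 1 and 0 in arithmetic; the function names are hypothetical, and the sketch says nothing about Excel's evaluation-step counts.

```python
def bonus_if(revenue, gp):
    # Mirrors IF(AND(B4>20000,C4>=0.3),(B4-20000)*0.2,0)
    return (revenue - 20000) * 0.2 if (revenue > 20000 and gp >= 0.3) else 0

def bonus_boolean(revenue, gp):
    # Mirrors (B4>20000)*(B4-20000)*(C4>=0.3)*0.2; each test multiplies as 1 or 0
    return (revenue > 20000) * (revenue - 20000) * (gp >= 0.3) * 0.2

rows = [(26000, 0.44), (19100, 0.28), (22000, 0.28), (11200, 0.86)]  # B4:C7
for revenue, gp in rows:
    print(revenue, gp, bonus_if(revenue, gp), bonus_boolean(revenue, gp))
```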


Lookup Functions
Lookups are among the most frequently used functions in the Excel function library. Unfortunately though, they are often the most likely cause of slow calculations. Fortunately there are a number of ways to improve lookup calculation times.

LOOKUP()
The LOOKUP function takes two forms, Vector or Array. The array version searches for a specific item in an array, and returns a value from the same position in the last column or row of the array. If multiple matches exist, LOOKUP returns the last match. The array must be sorted in ascending order. Error values in the array are ignored.
Syntax (array form): LOOKUP(lookup_val, array)
Syntax (vector form): LOOKUP(lookup_val, lookup_vector, result_vector)

Lookup the LAST item in a table:
Typical lookup functions match the first occurrence of an item in a table. LOOKUP will return the last match. Say you have a table of values in A1:A10, and you wish to yield an adjacent value from B1:B10, but should more than one occurrence exist, grab the last match:
LOOKUP(1,1/(A1:A10=lookup_value),B1:B10)

MATCH()
The MATCH function searches for a specific item in a 1-dimensional array of values (e.g. a range), and then returns the relative position of that item in the array.
Syntax: MATCH(lookup_value, lookup_array, match_type)
Match_type = 1 returns the position of the largest value less than or equal to the lookup value, provided the lookup array is sorted in ascending order.
Match_type = 0 requests an exact match.
Match_type = -1 returns the position of the smallest value greater than or equal to the lookup value, provided the lookup array is sorted in descending order.

VLOOKUP() The VLOOKUP function searches for a specific item in the left-most column of an array of values, and then returns a value from the same row from the desired column in the array. Syntax: VLOOKUP(lookup_value, lookup_array, col_index, match_type) Match_type = TRUE (or any non-zero number) returns the largest match less than or equal to the lookup value. The array must be sorted in ascending order. Match_type = FALSE (or zero) requests an exact match.

HLOOKUP() The HLOOKUP function searches for a specific item in the top-most row of an array of values, and then returns a value from the same column from the desired row in the array. Syntax: HLOOKUP(lookup_value, lookup_array, row_index, match_type) Match_type = TRUE (or any non-zero number) returns the largest match less than or equal to the lookup value. The array must be sorted in ascending order. Match_type = FALSE (or zero) requests an exact match.

INDEX()
The INDEX() function returns a value, or the reference to a value, from within an array or range. INDEX is a very versatile function indeed.
Syntax: INDEX(array, row_index, column_index)

Use INDEX and MATCH to perform right-to-left and bottom-to-top lookups:
VLOOKUP always seeks a lookup value in the left-most column of an array of values. HLOOKUP always seeks a lookup value in the top-most row of an array of values. INDEX with MATCH can be used to yield a lookup result regardless of the orientation of the lookup value field in the array.

     A         B               C    D               E
1    VALUES    LOOKUP FIELD
2    59        A                    LOOKUP VALUE    D
3    41        B
4    42        C                    FORMULA         =INDEX(A1:A7,MATCH(E2,$B$1:$B$7,1),0)
5    27        D                    RESULT          27
6    52        E
7    68        F

Table 4-6 Performing a right-to-left lookup with INDEX and MATCH

Use INDEX to yield a single dimension of a multi-dimensional array:
Say you have a named range in your workbook, and that the range referred to is multi-dimensional (i.e. refers to both columns and rows). At some point you might need to refer to only one dimension of that range (i.e. a row or a column). Of course you could configure a named range for each dimension, but that only adds to the size and complexity of your model. INDEX can be used to refer to a specific column or row of the array.

NAMED RANGE:    Sales_Table
REFERS TO:      Sales!$A$1:$M$100

FORMULA:    =INDEX(Sales_Table,0,2)    RETURNS:    Column B, i.e. Sales!$B$1:$B$100
FORMULA:    =INDEX(Sales_Table,8,0)    RETURNS:    Row 8, i.e. Sales!$A$8:$M$8

Table 4-7 Yielding an intersecting range using INDEX


Use INDEX with a reference operator to yield a contiguous range:
One of the properties of INDEX that makes it so versatile is that it can be used not only to return a value from a range or array, but also to yield a range. In the exhibit below, BigNum is a defined name holding a very large number, so MATCH returns the position of the last number in column B.

     A          B
1    INVOICE    AMOUNT
2    000016     438.16
3    000021     837.24
4    000025     77.42
5    000030     167.51
6    000071     675.53
7    000075     263.97

FORMULA            =SUM(B1:INDEX(B:B,MATCH(BigNum,B:B,1),0))
Step 1             =SUM(B1:INDEX(B:B,7,0))
Step 2             =SUM(B1:B7)
Step 3 (RESULT)    2,459.83

Table 4-8 Yielding a range using INDEX to return a range operand

CHOOSE()
The CHOOSE() function uses an index number to return a value from a list of arguments.
Syntax: CHOOSE(index_num, 1st_arg, 2nd_arg, ...)

Use CHOOSE to construct an array of calculated values:
One unconventional use of CHOOSE is to feed an array to the index_num argument, which causes CHOOSE to yield an array of the corresponding argument values. Hence CHOOSE({1;2;3},A1+A2,A2+A3,A3+A4) will yield the array {A1+A2;A2+A3;A3+A4}.

Use CHOOSE to create a right-to-left VLOOKUP:
Using the method described previously, we can construct a 2-dimensional array from two ranges. VLOOKUP(lookup_value,CHOOSE({1,2},B1:B100,A1:A100),2,TRUE) will look for lookup_value in column B but return the adjacent value from column A.

Further Lookup Tips

Avoid using exact match wherever possible. If you are doing lookups using the exact match option, the calculation time for the function is directly proportional to the number of cells scanned before a match is found. This might add significant overhead to calculations when doing lookups over large ranges. Lookups using the approximate match method on ordered data are considerably quicker.

Try to keep all lookup references in the same sheet. Using lookups to search for an item on a different sheet is considerably slower. Do however keep data on separate sheets if it makes the model that much easier to understand and maintain.

Avoid using oversized lookup ranges. When doing exact match lookups, restrict the range of cells to be scanned to a minimum. Consider using dynamic range names instead.

You can use approximate match even if your lookup value might be missing from the array. If you order your data ascending but still are not certain that the lookup value is in the lookup array, then apply the following approach:
=IF(COUNTIF(array, lookup_value),VLOOKUP(lookup_value, lookup_array, col_index, TRUE),NA())


Avoid unnecessary function calls when handling errors. The typical approach to handling lookup errors with ISNA() and ISERROR() is not the most efficient. Instead use the following approach:

     A    B       C
1    X    10      Result1
2    Y    100     Result2
3    Z    1000    Result3

For numbers:    =LOOKUP(9.99E+307,CHOOSE({1;2},0,VLOOKUP("a",A1:B3,2,0)))
For text:       =LOOKUP(REPT("Z",255),CHOOSE({1;2},"",VLOOKUP("a",A1:C3,3,0)))

Table 4-9 Handling lookup error values

At this point you might be wondering what the purpose of 9.99E+307 and REPT("Z",255) is. The former is just a really, really big number and gives relative assurance that the formula will yield the last number in the range. The latter yields a string of 255 "Z" characters, and also gives relative assurance that the formula will yield the last text value in the range. If you are going to rely on these frequently then it is suggested that you assign each to a defined name:
BigNum: =9.99999999E+307
BigText: =REPT("Z",255)

Binary Search versus Linear Search

The point of using approximate match in lookups is reiterated a number of times throughout this document. You might be wondering why an approximate match is so much faster than an exact match.

In the case of an exact match Excel is instructed to perform a linear search. Excel does not expect the range to be ordered; therefore the number of cells to evaluate is directly proportional to the position of the lookup value in the lookup range. VLOOKUP, HLOOKUP and MATCH will search for the lookup value in the table, iterating through each array item in the relevant range until they find the match. Once a match is found the search stops immediately (i.e. these functions are configured to find the first match only).

However, in the case of an approximate match, Excel is instructed to perform a binary search. Excel expects the array to be ordered. Imagine, if you will, looking for a specific listing in a very large telephone directory. You want to find the telephone number for "Sam's Delivery Service". If the telephone directory was not ordered A-Z you would literally need to start with the first entry and scan each entry until you find the listing. The time it takes to find the entry will be proportionate to the position of "Sam's Delivery Service" in the directory. Fortunately telephone directories are ordered. In reality you would take a punt at where the letter S is in the book. Let's say you guess wrong and you open listings beginning with the letter P. Perhaps you would fold over A-P and have a second punt, this time striking the letter U. Now you know that the S listings are somewhere between your first and second punt, so you would repeat this process until you find the specific listing.

What Excel does to yield a result with a binary search is not far different. Firstly Excel will divide the array into two equal arrays. It will analyse the first array item of the second array and determine if the array item is smaller than, equal to, or greater than the lookup value. If the item is smaller than the lookup value then it knows to search the second array. If the array item is larger than the lookup value then it knows to search the first array. It will repeat this process, each time narrowing down the search range, until a match is found. If no match is found then it will return the nearest approximate match. The definition of approximate match may vary depending on the lookup function used and what instruction was passed to the function; see the individual lookup function entries for more details.
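The difference between the two search strategies can be sketched in Python. This is an illustration of the general technique only, not of Excel's internal implementation: Python's standard `bisect` module stands in for the binary search, and the function names are hypothetical.

```python
from bisect import bisect_right

def linear_exact_match(lookup_value, items):
    """Scan from the top. Returns (1-based position or None, comparisons made)."""
    for steps, item in enumerate(items, start=1):
        if item == lookup_value:
            return steps, steps          # stop at the first match
    return None, len(items)

def binary_approx_match(lookup_value, sorted_items):
    """1-based position of the largest item <= lookup_value in ascending data."""
    pos = bisect_right(sorted_items, lookup_value)
    return pos if pos > 0 else None      # None where Excel would give #N/A

data = list(range(0, 200000, 2))         # 100,000 ascending even numbers

print(linear_exact_match(150000, data))  # comparisons grow with the position
print(binary_approx_match(150000, data)) # found by repeatedly halving the range
print(binary_approx_match(150001, data)) # missing value: nearest item below it
```

The linear scan needs as many comparisons as the position of the item, whereas halving a list of 100,000 items needs only about 17 steps, which is why ordered data and approximate match scale so much better.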

Math and Statistical Functions


The volume of functions in these two categories is huge, so we are only going to skim the surface and focus on some of the most frequently used and handy functions.

ROUND()
Rounds a number to a specified number of digits.
Syntax: ROUND(number, num_digits)

Use ROUND to round a number to the nearest multiple:

     A         B               C                     D
1    NUMBER    SIGNIFICANCE    FORMULA               RESULT
2    10        3               =ROUND(A2/B2,0)*B2    9
3    5.3       0.5             =ROUND(A3/B3,0)*B3    5.5
4    7.24      0.05            =ROUND(A4/B4,0)*B4    7.25

Table 4-10 Rounding to the nearest desired multiple using ROUND
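The divide, round, multiply pattern in Table 4-10 is easy to reproduce elsewhere. The Python sketch below uses the `decimal` module with half-away-from-zero rounding, which is an assumption chosen to resemble ROUND (Python's built-in `round` rounds halves to even instead); the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_multiple(number: str, significance: str) -> Decimal:
    """Mirror =ROUND(number/significance,0)*significance with exact decimals."""
    n, s = Decimal(number), Decimal(significance)
    steps = (n / s).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return steps * s

for number, significance in [("10", "3"), ("5.3", "0.5"), ("7.24", "0.05")]:
    print(number, significance, round_to_multiple(number, significance))
```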

MROUND()
Returns a number rounded to the nearest desired multiple of significance. In versions 2003 and prior, this function is only available through the Analysis ToolPak. Use the ROUND method described above if you are using Excel 2003 or prior, or if your workbook is to be viewed and used by users running Excel 2003 or prior.
Syntax: MROUND(number, significance)

ROUNDUP()
Rounds a number up, away from zero. ROUNDUP can be used in the same context described in ROUND.
Syntax: ROUNDUP(number, num_digits)

CEILING()
Rounds a number up, away from zero, to the nearest multiple of significance.
Syntax: CEILING(number, significance)

Manage the sign of the significance multiple carefully:

     A         B               C                  D
1    NUMBER    SIGNIFICANCE    FORMULA            RESULT
2    -1.4      -0.5            =CEILING(A2,B2)    -1.5
3    -1.4      0.5             =CEILING(A3,B3)    -1
4    1.4       0.5             =CEILING(A4,B4)    1.5

Table 4-11 Rounding up to the nearest desired multiple using CEILING


ROUNDDOWN()
Rounds a number down, towards zero. ROUNDDOWN can be used in the same context described in ROUND.
Syntax: ROUNDDOWN(number, num_digits)

FLOOR()
Rounds a number down, towards zero, to the nearest multiple of significance.
Syntax: FLOOR(number, significance)

INT()
Returns the integer part of a number; in other words, rounds a number down to the nearest integer.
Syntax: INT(number)

Use INT to yield the date-only portion of a date and time stamp:
In Excel dates are always whole numbers, where each day has a value of 1. Time is always a decimal value, since 1 hour is 1/24 of a day (0.0416666667).

     A                   B           C                  D
1    DATE & TIME         FORMULA     RESULT (number)    RESULT (dd/mm/yy)
2    25/03/2011 11:48    =INT(A2)    40627              25/03/2011

Table 4-12 Extracting a date from a date and time stamp

MOD()
Returns the remainder after a number has been divided by a divisor.
Syntax: MOD(number, divisor)

Use MOD to yield the time-only portion of a date and time stamp:
Removing the integer component of a date and time stamp will leave only the decimal value, which also happens to be the time value.

     A                   B             C                    D
1    DATE & TIME         FORMULA       RESULT (number)      RESULT (hh:mm)
2    25/03/2011 11:48    =MOD(A2,1)    0.491823842596204    11:48

Table 4-13 Extracting the time from a date and time stamp
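The INT and MOD split of a date and time stamp can be illustrated with a short Python sketch. The serial value combines the results shown in Tables 4-12 and 4-13; the epoch of 30 December 1899 is an assumption that holds for serials after February 1900, and `split_serial` is a hypothetical helper name.

```python
import math
from datetime import datetime, timedelta

EXCEL_EPOCH = datetime(1899, 12, 30)   # assumption: 1900 date system, after Feb 1900

def split_serial(serial: float):
    """Return (date part, time part), i.e. INT(serial) and MOD(serial, 1)."""
    date_part = math.floor(serial)
    time_part = serial % 1
    return date_part, time_part

date_part, time_part = split_serial(40627.491823842596)
as_date = EXCEL_EPOCH + timedelta(days=date_part)
as_time = (EXCEL_EPOCH + timedelta(days=time_part)).strftime("%H:%M")
print(date_part, as_date.strftime("%d/%m/%Y"))
print(time_part, as_time)
```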


Use MOD to aggregate each nth item in an array:
The following table demonstrates a stepped approach to summing every 4th sales item (in this exhibit, the quarterly results).

      A                B         C         D         E         F         G         H         I
1                      Jan       Feb       Mar       Q1        Apr       May       Jun       Q2
2     Sales            26,238    34,131    30,600    90,969    34,536    28,270    25,146    87,952
3
4     N                4
5
6     MOD result       1         2         3         0         1         2         3         0
7     MOD formula      =MOD(COLUMNS($B:B),$B$4)      (MOD formula in B6, copied across through to I6)
8
9     TOTAL result     178,921
10    TOTAL formula    =SUMIF(B6:I6,0,B2:I2)

Table 4-14 Summing the nth item in an array using MOD; a stepped approach
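The same stepped logic, a position counter, a remainder test and a conditional sum, can be sketched in Python with the figures from Table 4-14. The function name is hypothetical.

```python
labels = ["Jan", "Feb", "Mar", "Q1", "Apr", "May", "Jun", "Q2"]      # B1:I1
sales = [26238, 34131, 30600, 90969, 34536, 28270, 25146, 87952]     # B2:I2
n = 4                                                                # B4

def sum_every_nth(values, n):
    """Sum the items whose 1-based position leaves no remainder when divided by n."""
    return sum(v for position, v in enumerate(values, start=1) if position % n == 0)

mod_results = [position % n for position in range(1, len(sales) + 1)]  # row 6
print(mod_results)
print(sum_every_nth(sales, n))
```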

MAX()
Returns the maximum value in an array of numeric values. MAX supports up to 30 arguments in Excel 2003 and earlier, and up to 255 in later versions.
Syntax: MAX(number1, number2, ...)

Use MAX and MIN to avoid unnecessary IF function calls:
The steps involved in resolving a MAX or MIN statement are fewer than those involved in resolving an IF statement. Here are some examples of how one can avoid unnecessary IF function calls using MAX and MIN:

     A          B          C                                 D                     E
1    VALUE 1    VALUE 2    IF formula                        MAX formula           MAX result
2    8                     =IF(A2>0,A2,0)                    =MAX(A2,0)            8
3    -10        7          =IF(MIN(A3,B3)<0,0,MIN(A3,B3))    =MAX(0,MIN(A3,B3))    0

Table 4-15 Using MIN and MAX to avoid IF function calls

MIN()
Returns the minimum value in an array of numerical values. MIN supports up to 30 arguments in Excel 2003 and earlier, and up to 255 in later versions.
Syntax: MIN(number1, number2, ...)

LARGE()
Returns the nth largest value in an array. You can use this function to select a value based on its relative standing, or rank.
Syntax: LARGE(array, n)


Aggregate the top n values in an array:
A common requirement is to know the total of the largest n values in a list of values. This exhibit demonstrates using the SUM function, but this can be substituted with AVERAGE and various other aggregate functions.

      A         B
1               SALES
2     Order1    356
3     Order2    475
4     Order3    698
5     Order4    180
6     Order5    523
7     Order6    411
8     Order7    647
9     Order8    356
10
11    TOTAL     3,646
12    TOP 3     1,868
13              =SUM(LARGE(B2:B9,{1;2;3}))

Table 4-16 Summing the top n values in an array using SUM and LARGE
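For comparison, the top n total from Table 4-16 can be reproduced in Python. This sketch uses the standard library's `heapq.nlargest` in place of LARGE with an array of ranks; the function name is hypothetical.

```python
import heapq

sales = [356, 475, 698, 180, 523, 411, 647, 356]    # B2:B9

def sum_top_n(values, n):
    """Total of the n largest values, like =SUM(LARGE(range,{1;2;...;n}))."""
    return sum(heapq.nlargest(n, values))

print(heapq.nlargest(3, sales))
print(sum_top_n(sales, 3))
```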

SMALL()
Returns the nth smallest value in an array. You can use this function to select a value based on its relative standing, or rank.
Syntax: SMALL(array, n)

SUMPRODUCT()
Returns the sum of the products of corresponding array components. SUMPRODUCT is one of the most heavily depended-on functions, and one of the most versatile. But it is also quite inefficient and should be used in moderation, where no other alternative is available, or where the alternative is equally or more inefficient. SUMPRODUCT supports up to 30 arguments in Excel 2003 and earlier, and up to 255 in later versions.
Syntax: SUMPRODUCT(array1, array2, array3, ...)

Sum or count a range of values based on one or more criteria with SUMPRODUCT:
SUMPRODUCT has long been used in a similar context to SUMIF and COUNTIF. SUMIF and COUNTIF only support one criterion. SUMPRODUCT can support many more criteria. SUMIFS and COUNTIFS were introduced in Excel 2007 to support multiple criteria. Only use SUMPRODUCT to sum or count items in a range based on criteria if:
1. You need to reference an external workbook and need it to recalculate results even when that external workbook is closed. COUNTIF(S) and SUMIF(S) do not work on closed external workbooks.
2. You are using Excel 2003 or a prior version, or your workbook will be viewed and maintained by users of Excel 2003 and prior versions.

      A            B            C
1     CATEGORY1    CATEGORY2    TOTAL
2     A            X            140
3     B            Y            100
4     C            X            110
5     A            Y            70
6     B            Y            150
7     C            X            100
8
9     CATEGORY1    B
10    CATEGORY2    Y

COUNT    2      =SUMPRODUCT(--(A2:A7=B9),--(B2:B7=B10))
SUM      250    =SUMPRODUCT(--(A2:A7=B9),--(B2:B7=B10),C2:C7)

Table 4-17 Sum or Count a range using multiple criteria with SUMPRODUCT
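What the double unary minus does in Table 4-17 is turn each TRUE/FALSE test into 1 or 0, so that multiplying the arrays item by item keeps only the rows where every test passes. A minimal Python sketch of that idea follows; `sumproduct` is a hypothetical helper, not an Excel or library API.

```python
import math

def sumproduct(*arrays):
    """Multiply corresponding items across the arrays, then sum the products."""
    if len({len(a) for a in arrays}) != 1:
        raise ValueError("arrays differ in size")    # Excel yields #VALUE! here
    return sum(math.prod(row) for row in zip(*arrays))

category1 = ["A", "B", "C", "A", "B", "C"]           # A2:A7
category2 = ["X", "Y", "X", "Y", "Y", "X"]           # B2:B7
totals = [140, 100, 110, 70, 150, 100]               # C2:C7

is_b = [int(c == "B") for c in category1]            # --(A2:A7=B9)
is_y = [int(c == "Y") for c in category2]            # --(B2:B7=B10)

print(sumproduct(is_b, is_y))                        # COUNT
print(sumproduct(is_b, is_y, totals))                # SUM
```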

Further advice on SUMPRODUCT:
In Excel 2003 and prior versions, SUMPRODUCT does not support whole column or whole row references (e.g. A:A or 1:1). However, regardless of which Excel version applies, whole column and whole row references should be avoided as they add significant overhead to the calculation. One should limit range references to the size of the data range or use dynamic range names.
SUMPRODUCT will yield a #VALUE! error if any of its references differ in size. If an error value exists within any of the array arguments, SUMPRODUCT will yield that same error value.

COUNTIF() Counts the number of cells within a range that meet the given criteria. COUNTIF supports the use of comparison operators in its criteria arguments, and also supports the use of wildcard operators. COUNTIF is an incredibly fast calculating formula, but it also has its limitations. The limitations are largely the same for SUMIF. See SUMIF for more information. Syntax: COUNTIF(range, criteria) Find duplicates in a range using COUNTIF: Although there are other (possibly better) techniques for identifying distinct and duplicated values in a range, COUNTIF often appears as a quick and easy [and therefore preferred] method. This method involves counting the occurrences of the value based on its relative position in the table.

     A       B             C             D
1    NAME    EMPLOYEE #    OCCURRENCE    DUPE / DISTINCT
2    Jon     0000123       1             Distinct
3    Sue     0000186       1             Distinct
4    Bob     0000164       1             Distinct
5    Jon     0000123       2             Duplicate
6    Mary    0000177       1             Distinct
7    Bob     0000164       2             Duplicate

Formula in C2 (copied down):    =COUNTIF($B$2:$B2,$B2)
Formula in D2 (copied down):    =IF(C2-1,"Duplicate","Distinct")

Table 4-18 Identifying duplicates in a range of values using COUNTIF
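The running count in column C of Table 4-18 can be sketched in Python as follows. Note the design difference: the COUNTIF formula rescans an expanding range for every row, whereas this sketch keeps a dictionary of counts and makes a single pass. The function name is hypothetical.

```python
def running_occurrence(values):
    """For each item, how many times it has appeared so far (including itself)."""
    seen = {}
    result = []
    for v in values:
        seen[v] = seen.get(v, 0) + 1
        result.append(seen[v])
    return result

employee_ids = ["0000123", "0000186", "0000164", "0000123", "0000177", "0000164"]  # B2:B7

occurrence = running_occurrence(employee_ids)                        # column C
# Mirrors IF(C2-1,"Duplicate","Distinct"): any non-zero value counts as TRUE
status = ["Duplicate" if count - 1 else "Distinct" for count in occurrence]
print(occurrence)
print(status)
```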

Check if a cell conforms to a pattern:
Sometimes it is necessary to check if a value or text string is contained in a larger text string. For instance, in this example we look for all names that contain a middle name or initial (perhaps with the intention of identifying duplicate names).

     A                        B                  C
1    FULL NAME                HAS MIDDLE NAME    HAS MIDDLE INITIAL
2    Susan Elizabeth Jones    1                  0
3    Peter Brown              0                  0
4    Thomas Stephen Steele    1                  0
5    James Crane              0                  0
6    Fiona C Wood             0                  1
7    Matthew Elliot Gray      1                  0
8    George Harper            0                  0

Formula in B2 (copied down):    =COUNTIF(A2,"* ??* *")
Formula in C2 (copied down):    =COUNTIF(A2,"* ? *")
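The two wildcard patterns can be tried out in Python, whose `fnmatch` module happens to use the same meaning for `*` (any run of characters) and `?` (any single character). This is only an analogue of COUNTIF's matching, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase

def has_middle_name(full_name: str) -> int:
    # space, at least two characters, space: mirrors COUNTIF(A2,"* ??* *")
    return int(fnmatchcase(full_name, "* ??* *"))

def has_middle_initial(full_name: str) -> int:
    # space, exactly one character, space: mirrors COUNTIF(A2,"* ? *")
    return int(fnmatchcase(full_name, "* ? *"))

names = ["Susan Elizabeth Jones", "Peter Brown", "Thomas Stephen Steele",
         "James Crane", "Fiona C Wood", "Matthew Elliot Gray", "George Harper"]

for name in names:
    print(name, has_middle_name(name), has_middle_initial(name))
```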

SUMIF()
Adds the cells specified by a given criterion. SUMIF supports comparison operators in its criteria argument, and also supports wildcard characters. SUMIF calculates very quickly, but it has its limitations. The sum range must have the same number of rows and columns as the criteria range, although it does not have to be adjacent to the criteria range.

Syntax: SUMIF(criteria_range, criteria, sum_range)

Sum values in a range with multiple criteria in the same criteria range: SUMPRODUCT is occasionally misused because users aren't aware that SUMIF (and COUNTIF) can support multi-conditional evaluation, provided that the criteria are sought in the same criteria range.

     A          B         C    D           E
1    ACCOUNT    AMOUNT         CRITERIA
2    0001558    645.05         0001526
3    0001875    459.62         0001558
4    0001622    730.60
5    0001526    432.01         FORMULA     =SUMPRODUCT(SUMIF(A2:A6,D2:D3,B2:B6))
6    0001752    568.04         RESULT      1,077.06

Table 4-19 Sum values in a range based on multiple criteria in the same criteria range
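To see why one SUMIF call with a range of criteria is equivalent to adding one SUMIF per criterion, the evaluation can be sketched in Python. The helper sum_if is an assumption for illustration and handles only an exact-match criterion.

```python
def sum_if(criteria_range, criterion, sum_range):
    """Add the sum_range entries whose criteria_range entry equals criterion."""
    return sum(value for key, value in zip(criteria_range, sum_range) if key == criterion)

# Data from the exhibit; account numbers are text so the leading zeros survive.
accounts = ["0001558", "0001875", "0001622", "0001526", "0001752"]
amounts = [645.05, 459.62, 730.60, 432.01, 568.04]
criteria = ["0001526", "0001558"]

# SUMIF with a two-cell criteria argument yields one result per criterion ...
per_criterion = [sum_if(accounts, criterion, amounts) for criterion in criteria]
# ... and the outer SUMPRODUCT (or SUM) adds those results together.
total = sum(per_criterion)

print(per_criterion, round(total, 2))
```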


Note that the result in this exhibit can also be achieved with =SUMIF(A2:A6,D2,B2:B6)+SUMIF(A2:A6,D3,B2:B6). The version in the exhibit, however, avoids an unnecessary function call, shortens the formula and calculates faster. Remember that when the criteria argument is an inline array constant, Excel knows automatically that the formula is to be treated as an array formula. This means that SUM is a sufficient replacement for SUMPRODUCT, so the formula can also be expressed as =SUM(SUMIF(A2:A6,{0001526,0001558},B2:B6)).

Sum values that correspond to empty cells: In some models a column is used to identify each value, and cells in that column are left blank where they correspond to sub-totals. This method might prove useful to sum the sub-total values. The equal to comparison operator ("=") is used to represent empty cells.

      A                B
1     ACCOUNT          AMOUNT
2     0001526          645.05
3     0001558          432.01
4                      1,077.06
5
6     0001875          459.62
7     0001879          232.22
8                      691.84
9
10    0001622          730.60
11    0001684          568.04
12                     1,298.64
13
14    Total formula    =SUMIF(A2:A12,"=",B2:B12)
15    Total result     3,067.54

Table 4-20 Summing values that correspond to empty cells using SUMIF

Sum values that correspond to non-empty cells: Using the previous exhibit, this time we are adding cells that correspond to non-empty cells. The not equal to comparison operator ("<>") is used to represent non-empty cells.

      A                B
1     ACCOUNT          AMOUNT
2     0001526          645.05
3     0001558          432.01
4                      1,077.06
5
6     0001875          459.62
7     0001879          232.22
8                      691.84
9
10    0001622          730.60
11    0001684          568.04
12                     1,298.64
13
14    Total formula    =SUMIF(A2:A12,"<>",B2:B12)
15    Total result     3,067.54

Table 4-21 Summing cells that correspond to non-empty cells using SUMIF

Sum values between two dates: This is quite a popular question on the forums. You have a date range with corresponding values, and you wish to sum only the values between two given dates. The math is actually very simple: first we sum the values that correspond to dates greater than or equal to the start date, and then we subtract the values that correspond to dates greater than the end date.

      A                B
1     DATE             AMOUNT
2     04-Dec-10        3,233.64
3     05-Dec-10        7,715.04
4     31-Dec-10        5,158.13
5     09-Jan-11        2,986.90
6     11-Jan-11        6,780.95
7     11-Jan-11        7,129.93
8     21-Jan-11        1,841.97
9     08-Feb-11        6,471.18
10    25-Mar-11        9,522.12
11
12    Date1            01-Jan-11
13    Date2            31-Jan-11
14
15    Total formula    =SUMIF(A2:A10,">="&B12,B2:B10)-SUMIF(A2:A10,">"&B13,B2:B10)
16    Total result     18,739.75

Table 4-22 Sum values between two dates using SUMIF
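The subtraction can be verified with a short Python sketch that uses the figures from the exhibit. The variable names are assumptions for illustration; the two sums play the role of the two SUMIF calls.

```python
from datetime import date

# (date, amount) pairs from the exhibit
rows = [
    (date(2010, 12, 4), 3233.64),
    (date(2010, 12, 5), 7715.04),
    (date(2010, 12, 31), 5158.13),
    (date(2011, 1, 9), 2986.90),
    (date(2011, 1, 11), 6780.95),
    (date(2011, 1, 11), 7129.93),
    (date(2011, 1, 21), 1841.97),
    (date(2011, 2, 8), 6471.18),
    (date(2011, 3, 25), 9522.12),
]
date1 = date(2011, 1, 1)   # start date
date2 = date(2011, 1, 31)  # end date

# First SUMIF: everything on or after the start date.
on_or_after_start = sum(amount for day, amount in rows if day >= date1)
# Second SUMIF: everything strictly after the end date.
after_end = sum(amount for day, amount in rows if day > date2)

total = on_or_after_start - after_end
print(round(total, 2))
```

Because the second sum uses a strict greater-than test, values dated exactly on the end date stay in the total.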

A better function to use would be SUMIFS, assuming you have Excel 2007 or later. Another option would be SUMPRODUCT, but even a single SUMPRODUCT formula would be less efficient than two SUMIF function calls.

Offset the sum range reference to add nonadjacent cells: The sum range must have the same dimensions as the criteria range, but that does not mean that the two must be adjacent to each other. In this exhibit the sum range starts one row below the criteria range, so each label in column B picks up the value in the cell directly beneath it.

      A                B           C
1     Profit Centre    00055512
2                      Sales
3                      18000
4                      Costs
5                      1600
6
7     Profit Centre    00055518
8                      Sales
9                      2400
10                     Costs
11                     1800
12
13
14    Total Sales      20400       =SUMIF(B2:B10,"=Sales",B3:B11)
15    Total Costs      3400        =SUMIF(B2:B10,"=Costs",B3:B11)

Table 4-23 Offsetting the sum range in SUMIF


Further advice on SUMIF(S) and COUNTIF(S):
- You can only pass range references to the criteria range and sum range arguments of SUMIF(S) and COUNTIF(S). SUMPRODUCT, by contrast, supports arrays in its arguments. Note that you can pass an array to the criteria argument of SUMIF(S) and COUNTIF(S).
- SUMIF(S) and COUNTIF(S) will not compute results when they reference an external workbook that is closed. Instead they yield the #VALUE! error value.
- Evaluation speed is not slowed down when referencing whole columns or whole rows, unlike SUMPRODUCT, where the speed is directly proportional to the size of the arrays passed to its arguments.
- SUMIF and COUNTIF only support a maximum of one criteria range.

COUNTIFS()
Applies criteria to cells across multiple ranges and counts the number of times all criteria are met. COUNTIFS is only available in Excel 2007 and later versions. Up to 127 range/criteria pairs are allowed. Each additional range must have the same number of rows and columns as the criteria_range1 argument, although the ranges do not have to be adjacent to each other. The use of this function, and its limitations, are already described in the SUMIF and COUNTIF topics.

Syntax: COUNTIFS(criteria_range1, criteria1, criteria_range2, criteria2, ...)

SUMIFS()
Adds the cells in a range that meet multiple criteria. SUMIFS is only available in Excel 2007 and later versions. The order of arguments differs between the SUMIFS and SUMIF functions. In particular, the sum range is the first argument in SUMIFS, but it is the third argument in SUMIF. If you are copying and editing these similar functions, make sure you put the arguments in the correct order.

Syntax: SUMIFS(sum_range, criteria_range1, criteria1, criteria_range2, criteria2, ...)

Text Functions
Excel's text functions allow for the manipulation of text values, and they are very useful even in mathematical formulae.

TRIM()
Removes all spaces from text except for single spaces between words. Use TRIM on text that you have received from another application that may have irregular spacing, or on text strings that you suspect might have leading or trailing spaces.

Syntax: TRIM(text)


LEN()
LEN returns the number of characters in a text string.

Syntax: LEN(text)

REPLACE()
REPLACE replaces part of a text string, based on the number of characters you specify, with a different text string.

Syntax: REPLACE(old_text, start_num, num_chars, new_text)

SUBSTITUTE()
This function substitutes new text for old text in a text string. Use SUBSTITUTE when you want to replace specific text in a text string; use REPLACE when you want to replace any text that occurs in a specific location in a text string.

Syntax: SUBSTITUTE(text, old_text, new_text, instance_num)

MID()
MID returns a specific number of characters from a text string, starting at the position you specify.

Syntax: MID(text, start_num, num_chars)

Use MID to drop leading characters instead of RIGHT: A typical approach to dropping leading characters might be to use a combination of the RIGHT and LEN functions. Using MID or REPLACE avoids an unnecessary function call, although the MID method is likely to be the more efficient of the two. Note that the REPLACE method instinctively trims the text.

     A                            B                       C                 D
1                                 RIGHT/LEN               MID               REPLACE
2    ; Drop the 1st two chars!    =RIGHT(A2,LEN(A2)-2)    =MID(A2,3,255)    =REPLACE(A2,1,2,"")
3
4    RESULT                       Drop the 1st two chars!

Table 4-24 Dropping leading characters with MID and REPLACE

LEFT()
LEFT returns the first character or characters in a text string, based on the number of characters you specify.

Syntax: LEFT(text, num_chars)

RIGHT()
RIGHT returns the last character or characters in a text string, based on the number of characters you specify.

Syntax: RIGHT(text, num_chars)


FIND()
FIND locates one text string within a second text string, and returns the starting position of the first text string, counted from the first character of the second text string. FIND is case sensitive and does not support wildcard characters. SEARCH supports case insensitive evaluation and also supports wildcard characters, but is considerably slower.

Syntax: FIND(find_text, within_text, start_num)

Use FIND in a case insensitive context: Whilst SEARCH supports case insensitive evaluation, it is considerably slower, so much so that using FIND with a couple of extra function calls (UPPER or LOWER) is still known to be faster.

     A           B
1    Find:       Text
2    Within:     Some text here
3
4    Formula:    =FIND(UPPER(B1),UPPER(B2))
5    Result:     6

SEARCH()
SEARCH is the case insensitive equivalent to FIND, but is considerably slower.

Syntax: SEARCH(find_text, within_text, start_num)

EXACT()
EXACT compares two text strings and returns TRUE if they are exactly the same, FALSE otherwise. EXACT is case sensitive but ignores formatting differences. EXACT can also be used to compare numeric values, and it is useful for comparing one array of values against another.

Syntax: EXACT(text1, text2)

Date Functions
When dealing with dates Excel has a collection of very useful functions. Some of these functions were only available in Excel 2003 and earlier versions through the Analysis Toolpak. Since Excel 2007 the full collection has been included in the standard function library. When catering for Excel 2003 and earlier, it is often best to use alternative methods in case the users do not have the Analysis Toolpak installed or available.

DATE()
DATE returns the serial number of a particular date.

Syntax: DATE(year, month, day)

Return a serial date exactly n months before or after a specified date: Because the number of days in each month varies, it is difficult to use traditional arithmetic to compute an exact date a certain number of months after (or before) a given date. The EDATE function can be used to achieve the same, but in Excel 2003 and earlier versions it is only available through the Analysis Toolpak.

     A           B             C    D             E
1    Date        27/03/2011         RESULT        FORMULA
2
3    + Months    2                  27/05/2011    =DATE(YEAR(B1),MONTH(B1)+B3,DAY(B1))
4    - Months    7                  27/08/2010    =DATE(YEAR(B1),MONTH(B1)-B4,DAY(B1))

Table 4-25 Return a serial date exactly n months before or after a specified date
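The method relies on DATE accepting month and day numbers outside their normal ranges and rolling them over into the neighbouring months or years. The Python sketch below imitates that roll-over; excel_date is a hypothetical helper written for illustration, not an Excel or Python library function.

```python
from datetime import date, timedelta

def excel_date(year, month, day):
    """Imitate DATE(year, month, day), letting month and day roll over."""
    # Months outside 1..12 spill into earlier or later years.
    year_offset, month_index = divmod(month - 1, 12)
    first_of_month = date(year + year_offset, month_index + 1, 1)
    # Days outside the month spill into neighbouring months.
    return first_of_month + timedelta(days=day - 1)

start = date(2011, 3, 27)
plus_two = excel_date(start.year, start.month + 2, start.day)
minus_seven = excel_date(start.year, start.month - 7, start.day)

print(plus_two, minus_seven)
```

One consequence worth knowing: when the target month is shorter than the day number, the surplus days roll into the following month. For example, excel_date(2011, 2, 31) gives 3 March 2011, whereas EDATE would stop at the last day of February.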

Return the first day and the last day of the month for a given date: Another popular question on the forums. The typical requirement is to compute the first and last day of the month for a given date, for the purpose of working out the number of working days in the month.

     A                  B             C
1    Date               27/03/2011
2
3    First Day:         01/03/2011    =B1-DAY(B1)+1
4    Last Day:          31/03/2011    =DATE(YEAR(B1),MONTH(B1)+1,0)
5
6    Networking Days    23            =NETWORKDAYS(B3,B4,Holidays)

Table 4-26 Return the 1st and last day of the month of a given date
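The three results in the exhibit can be reproduced with a short Python sketch. The function names are assumptions for illustration, and network_days is a simplified stand-in for NETWORKDAYS that treats Saturday and Sunday as the weekend.

```python
from datetime import date, timedelta

def first_day(d):
    """Step back to day 1 of the month."""
    return d - timedelta(days=d.day - 1)

def last_day(d):
    """Day 0 of the following month, i.e. the day before its first day."""
    year_offset, month_index = divmod(d.month, 12)
    return date(d.year + year_offset, month_index + 1, 1) - timedelta(days=1)

def network_days(start, end, holidays=()):
    """Count Monday-to-Friday days from start to end inclusive, skipping holidays."""
    count = 0
    current = start
    while current <= end:
        if current.weekday() < 5 and current not in holidays:
            count += 1
        current += timedelta(days=1)
    return count

given = date(2011, 3, 27)
first, last = first_day(given), last_day(given)
print(first, last, network_days(first, last))
```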

EDATE()
EDATE returns the serial number that represents the date that is the indicated number of months before or after a specified date (the start date). Use EDATE to calculate maturity dates or due dates that fall on the same day of the month as the date of issue. EDATE is housed in the Analysis Toolpak function library in Excel 2003 and earlier versions.

Syntax: EDATE(start_date, months)

EOMONTH()
EOMONTH returns the serial number for the last day of the month that is the indicated number of months before or after the start date. Use EOMONTH to calculate maturity dates or due dates that fall on the last day of the month. EOMONTH is housed in the Analysis Toolpak function library in Excel 2003 and earlier versions.

Syntax: EOMONTH(start_date, months)


DATEDIF()
The DATEDIF function calculates the difference between two dates in a variety of different intervals, such as the number of years, months, or days between the dates. This function is available in all versions of Excel since at least version 95, but is documented in the help file only for Excel 2000.

Syntax: DATEDIF(start_date, end_date, interval)

INTERVAL    MEANING                                DESCRIPTION
"m"         Months                                 Complete calendar months between the dates.
"d"         Days                                   Number of days between the dates.
"y"         Years                                  Complete calendar years between the dates.
"ym"        Months Excluding Years                 Complete calendar months between the dates as if they were of the same year.
"yd"        Days Excluding Years                   Complete calendar days between the dates as if they were of the same year.
"md"        Days Excluding Years And Months        Complete calendar days between the dates as if they were of the same month and same year.

Table 4-27 DATEDIF interval values

WEEKNUM()
WEEKNUM returns the week number of a serial date. The base for WEEKNUM is the 1st of January. The return type argument tells Excel on which day the week begins, with the default being 1 (Sunday). WEEKNUM is housed in the Analysis Toolpak function library in Excel 2003 and earlier versions.

Syntax: WEEKNUM(serial_date, return_type)

NETWORKDAYS()
NETWORKDAYS returns the number of whole working days between the start date and the end date. Working days exclude weekends and any dates identified in holidays. NETWORKDAYS is housed in the Analysis Toolpak function library in Excel 2003 and earlier versions.

Syntax: NETWORKDAYS(start_date, end_date, holidays)

WORKDAY()
WORKDAY returns a number that represents a date that is the indicated number of working days before or after a start date. Working days exclude weekends and any dates identified as holidays. WORKDAY is housed in the Analysis Toolpak function library in Excel 2003 and earlier versions.

Syntax: WORKDAY(start_date, num_days, holidays)

Database Functions
Database functions are used to query databases or, better put, column-oriented tabular data with field names. Database functions calculate quickly and they definitely have a part to play in optimising complex models. The only major drawback is that criteria must be housed in a range; these criteria tables can become difficult to maintain and might occupy a considerable range. Database functions do not support array arguments.


The syntax of the D functions (as they are commonly known) is standard:

Syntax: [DFunction](database, field, criteria)

- Database: This must be a reference to a database, i.e. a column-oriented data table where each column represents a field.
- Field: Indicates which column in the database is to be aggregated by the function. This can be the field name, entered in speech marks, or the field index number.
- Criteria: This must also be a reference, either a range reference or a named range reference. You can use any range for the criteria argument, as long as it includes at least one field name and at least one criteria value for the given criteria field.

DSUM()
DSUM adds values in a field (column) in a list or database that match specified criteria. DSUM can be used as an alternative to certain SUMIF(S) and SUMPRODUCT methods.

DAVERAGE()
DAVERAGE computes the mean of values in a field (column) in a list or database that match specified criteria. DAVERAGE can be used as an alternative to certain AVERAGEIF(S) and other average array formula methods.

DCOUNT()
DCOUNT counts values in a field (column) in a list or database that match specified criteria. DCOUNT can be used as an alternative to certain COUNTIF(S) and SUMPRODUCT methods.

DGET()
DGET extracts a single value from a field (column) in a list or database that matches specified criteria. DGET can be used as an alternative to various lookup methods.

DMAX()
DMAX returns the largest number from a field (column) in a list or database that matches specified criteria.

DMIN()
DMIN returns the smallest number from a field (column) in a list or database that matches specified criteria.


Database Function Examples

      A            B                 C        D
1     Criteria:
2     Region       Country           Sales    COGS
3     =NWE
4
5     Database:
6     Region       Country           Sales    COGS
7     NWE          United Kingdom    6,403    3,841
8     NWE          Norway            9,778    3,813
9     NWE          Finland           3,938    118
10    SWE          Spain             6,110    61
11    SWE          Portugal          2,736    465
12    SWE          France            7,177    6,889
13
14    Results      NWE Sales:
15    SUM          20,119            =DSUM(A6:D12,"Sales",A2:D3)
16    COUNT        3                 =DCOUNT(A6:D12,3,A2:D3)
17    AVERAGE      6,706             =DAVERAGE(A6:D12,"Sales",A2:D3)
18    MIN          3,938             =DMIN(A6:D12,3,A2:D3)
19    MAX          9,778             =DMAX(A6:D12,"Sales",A2:D3)

Table 4-28 Aggregating results with D Functions with a single criterion

      A            B                 C        D
1     Criteria:
2     Region       Country           Sales    COGS
3                  =Norway
4                  =Finland
5     Database:
6     Region       Country           Sales    COGS
7     NWE          United Kingdom    6,403    3,841
8     NWE          Norway            9,778    3,813
9     NWE          Finland           3,938    118
10    SWE          Spain             6,110    61
11    SWE          Portugal          2,736    465
12    SWE          France            7,177    6,889
13
14    Results      Baltic Sales:
15    SUM          13,716            =DSUM(A6:D12,"Sales",A2:D4)
16    COUNT        2                 =DCOUNT(A6:D12,3,A2:D4)
17    AVERAGE      6,858             =DAVERAGE(A6:D12,"Sales",A2:D4)
18    MIN          3,938             =DMIN(A6:D12,3,A2:D4)
19    MAX          9,778             =DMAX(A6:D12,"Sales",A2:D4)

Table 4-29 Aggregating results with D Functions using multiple criteria (OR logic)


      A            B                    C        D
1     Criteria:
2     Region       Country              Sales    COGS
3     =NWE                              >5000
4
5     Database:
6     Region       Country              Sales    COGS
7     NWE          United Kingdom       6,403    3,841
8     NWE          Norway               9,778    3,813
9     NWE          Finland              3,938    118
10    SWE          Spain                6,110    61
11    SWE          Portugal             2,736    465
12    SWE          France               7,177    6,889
13
14    Results      NWE Sales > 5000:
15    SUM          16,181               =DSUM(A6:D12,"Sales",A2:D3)
16    COUNT        2                    =DCOUNT(A6:D12,3,A2:D3)
17    AVERAGE      8,091                =DAVERAGE(A6:D12,"Sales",A2:D3)
18    MIN          6,403                =DMIN(A6:D12,3,A2:D3)
19    MAX          9,778                =DMAX(A6:D12,"Sales",A2:D3)

Table 4-30 Aggregating results with D Functions using multiple criteria (AND logic)
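The way a criteria range is read, with conditions on the same row combined with AND and separate rows combined with OR, can be sketched in Python. This is an illustration only: d_values is a hypothetical helper, and each criteria row is written as a dictionary of field name to test function rather than as worksheet cells.

```python
database = [
    {"Region": "NWE", "Country": "United Kingdom", "Sales": 6403, "COGS": 3841},
    {"Region": "NWE", "Country": "Norway", "Sales": 9778, "COGS": 3813},
    {"Region": "NWE", "Country": "Finland", "Sales": 3938, "COGS": 118},
    {"Region": "SWE", "Country": "Spain", "Sales": 6110, "COGS": 61},
    {"Region": "SWE", "Country": "Portugal", "Sales": 2736, "COGS": 465},
    {"Region": "SWE", "Country": "France", "Sales": 7177, "COGS": 6889},
]

def d_values(records, field, criteria_rows):
    """Return the field values of records that satisfy any one criteria row.

    Within a row every test must pass (AND); a record qualifies if at
    least one row passes (OR).
    """
    return [
        record[field]
        for record in records
        if any(all(test(record[name]) for name, test in row.items())
               for row in criteria_rows)
    ]

single = [{"Region": lambda v: v == "NWE"}]
either = [{"Country": lambda v: v == "Norway"}, {"Country": lambda v: v == "Finland"}]
both = [{"Region": lambda v: v == "NWE", "Sales": lambda v: v > 5000}]

for criteria in (single, either, both):
    sales = d_values(database, "Sales", criteria)
    print(sum(sales), len(sales), round(sum(sales) / len(sales)), min(sales), max(sales))
```

The three printed lines correspond to the SUM, COUNT, AVERAGE, MIN and MAX results of the three exhibits.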


5. Dynamic Named Ranges


As mentioned previously in this document, a name can be used to house a formula, house a constant or reference a range. We have also already mentioned that the INDEX worksheet function can be used to yield a range reference. Therefore we can define names, and use a formula method including INDEX to yield a dynamic range. A dynamic range is a range reference that grows when new information is added to a table, and shrinks when information is removed from a table. The range might return a reference to a single column or row, or it might include reference to both.

When to Use Dynamic Named Ranges


There are plenty of practical reasons to use a dynamic named range. You can use the same formula methods described directly on the worksheet, but often it is practical to embed them into defined names. Use dynamic named ranges:
- When it is likely that data will be added to or removed from a table; and
- When you need to repeatedly call the range reference in formulae, pivot tables and charts; and/or
- When the range reference is likely to grow or shrink and you need to reference the range in array formulae.

One-Dimensional Dynamic Range


You are developing a model that you foresee being in continual use over a number of years. This model contains date formulae, specifically the NETWORKDAYS and WORKDAY functions, and you need to exclude holidays from these calculations. Therefore you house a single-column list of bank holiday dates. This list can change each year (such as an additional holiday in 2011, the Royal Wedding).

      A                      B
1     HOLIDAY SERIAL DATE    EVENT
2     27/12/2010             Christmas Break
3     28/12/2010             Christmas Break
4     29/12/2010             Christmas Break
5     30/12/2010             Christmas Break
6     03/01/2011             New Year's Day
7     22/04/2011             Good Friday
8     25/04/2011             Easter Monday
9     29/04/2011             Royal Wedding
10    02/05/2011             May Day Bank Holiday
11    30/05/2011             Spring Bank Holiday
12    29/08/2011             Late Summer Holiday

Table 5-1 Dynamic Table of Holiday Dates

For the purpose of calculation in NETWORKDAYS and WORKDAY, the only reference that you need to pass is the serial dates, in this exhibit located in A2:A12. Before you define a new dynamic range you need to consider: will the range contain only numeric values, only text values, or a mixture of the two data types?


Dynamic Ranges - Numbers Only
In the exhibit described we require a reference to a range that contains dates, and therefore numeric values only. Firstly, if you intend to build several dynamic named ranges over ranges that might contain numbers, you should set up a name to store BigNum.

Name: BigNum
Refers To: =9.9999999999999E+307

Dynamic Range:
Name: Holidays_Dates
Refers To: =$A$2:INDEX($A:$A,MATCH(BigNum,$A:$A,TRUE),0)

The A2 reference is static; it is the starting point of your Holidays range in all cases. We then use the INDEX method to yield the last reference in the range, using the MATCH function to perform a binary search for BigNum in column A. Since BigNum will always be larger than any date in the range, MATCH will yield the row index number of the last entry in the table. INDEX on its own would only return the last value in the table, but when bolted onto the starting reference with the range operator it refers to the last cell in the range. The result is a contiguous range spanning from the starting reference all the way down to the last populated cell in the table, namely A2:A12.

Dynamic Ranges - Text Only
The method to yield a dynamic range for a table of text is the same as the method described above, only we substitute BigText for BigNum. BigText is usually set up as a long string of z characters (255 of them in the definition below). Naturally we need to choose a range of text values, so we substitute column B (the holiday names) for column A (the date values).

Name: BigText
Refers To: =REPT("z",255)

Dynamic Range:
Name: Holiday_Dates
Refers To: =$B$2:INDEX($B:$B,MATCH(BigText,$B:$B,TRUE),0)
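The effect of matching BigNum against a column can be pictured with a Python sketch. It reproduces only the outcome described above (the position of the last numeric entry), not Excel's binary search; last_numeric_row and serial are hypothetical helpers, and the column is a simplified stand-in for column A with a text header and some trailing empty cells.

```python
from datetime import date

EXCEL_EPOCH = date(1899, 12, 30)  # origin of Excel serial dates for modern dates

def serial(d):
    """Convert a date to its Excel-style serial number."""
    return (d - EXCEL_EPOCH).days

def last_numeric_row(column):
    """Return the 1-based position of the last numeric cell, or None if there is none."""
    last = None
    for position, cell in enumerate(column, start=1):
        if isinstance(cell, (int, float)) and not isinstance(cell, bool):
            last = position
    return last

holidays = [
    date(2010, 12, 27), date(2010, 12, 28), date(2010, 12, 29), date(2010, 12, 30),
    date(2011, 1, 3), date(2011, 4, 22), date(2011, 4, 25), date(2011, 4, 29),
    date(2011, 5, 2), date(2011, 5, 30), date(2011, 8, 29),
]
# Row 1 is the text header; empty cells are represented by None.
column_a = ["HOLIDAY SERIAL DATE"] + [serial(d) for d in holidays] + [None, None, None]

last_row = last_numeric_row(column_a)
dynamic_range = column_a[1:last_row]  # rows 2 to last_row, like $A$2:INDEX(...)

print(last_row, len(dynamic_range))
```

Appending another date to the list moves last_row down by one, which is the growing behaviour that the defined name provides on the worksheet.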

Multi-Dimensional Dynamic Ranges


Multi-dimensional ranges are those that span more than one column and more than one row. Where these are required one must first decide upon a lead column, where the row number is dynamic, or a lead row, where the column number is dynamic.

Name: Holiday_Table
Refers To: =$A$2:INDEX($B:$B,MATCH(BigNum,$A:$A,TRUE),0)

Analysing the above, the start of the range is cell A2. It is then joined (with the range operator) to a cell from column B (noted by the first argument of the INDEX function). The column of dates in column A has been chosen as the lead for the table height, noted by the second argument of the MATCH function.


6. Using Tables
A Table, referred to as a List in prior versions, is a range of cells that holds data, with each row corresponding to an individual record (although the first row may be used to house field names). Tables are incredibly useful and are excellent as pivot table sources and as sources upon which many array formulae will be dependent. They are dynamic and therefore spare the need for dynamic ranges in certain circumstances. By default, tables have Auto Filters enabled, although these can be switched off. In versions 2007 and later these filters can also be used to sort the data. Users must be aware, however, that models cannot include both tables and custom views. Users of tables should also familiarise themselves with table formula syntax. Assume the following Table, called Sales:

     A                      B            C            D
1    DIVISIONS              Q1           Q2           Q3
2    Real Estate            2,500,000    2,650,000    2,550,000
3    IT                     980,000      890,000      915,000
4    Accounting Services    345,000      360,000      365,000

Fortunately reference operators and such remain unchanged. The key change here is how we reference areas, or ranges, within the table.

REFERENCE            MEANS                                                    RANGE EQUIVALENT
=Sales[#All]         The entire table, including headers, data, and totals    A1:D4
=Sales[#Data]        Data only                                                A2:D4
=Sales[#Headers]     The header row                                           A1:D1
=Sales[#Totals]      The totals row, if one exists                            [NULL]
=Sales[#This Row]    The current row, according to the cursor position        A2:D2 (assuming cursor is in row 2)

Table 6-1 Table reference syntax

One can also use calculations within Tables, and in doing so use the Table structured references. For instance, assuming you wish to add a totals column for Q1, Q2 and Q3, called YearToDate, the following could be used:

=SUM(Sales[@[Q1]:[Q3]])

To add totals to the bottom of the table, simply go to the next available row and enter your totals formula. Excel should be intelligent enough to interpret it as a totals row. One can also refer to the field names, either referencing them entirely, or in components. Let us study the following table:

REFERENCE                               MEANS                         RANGE EQUIVALENT
=Sales[[#All],[DIVISIONS]]              The entire DIVISIONS field    A1:A5
=Sales[[#Headers],[DIVISIONS]]          The DIVISIONS header only     A1
=Sales[[#Totals],[YearToDate]]          The YearToDate subtotal       E5
=Sales[[Q1]:[Q3]]                       All data between Q1 and Q3    B2:D4
=Sales[[#Headers],[#Data],[Q1]:[Q3]]    Q1:Q3 headers and data        B1:D4

The best way for one to familiarise oneself with Tables is to force oneself to use them! Give it a go!


7. Auditing Formula
Perhaps even trickier than writing formulae is auditing formulae: not only those written by others, but even one's own, sometimes six months down the line and sometimes only two hours later. The trick to being able to understand one's own work later down the line is:
- Keep it simple. It's very tempting to make things look a lot more complicated than they are!
- Use names! And use a descriptive naming convention.
- Keep a hidden sheet with notes that you can revisit later.
- Break down long formulae into many lines at relevant intervals. Hitting ALT+ENTER in formula entry mode allows continuation of the formula onto a new line.

So what process should one follow when auditing formulae? There are plenty of different methods and tools, but in reality the only truly effective process is as follows:
1. Briefly attempt to resolve the formula simply by looking at it. You should recognise fairly quickly whether or not you can resolve the formula without further assistance.
2. If the formula is complex, are there notes on it in some sort of log or audit sheet? If so, that should be your first port of call.
3. Does the formula reference names? If so, study each name in the Name Manager. Perhaps jot them down on a piece of paper with a note on what each refers to.
4. Can you see a logical order in which to study the formula? If so, break the formula down onto multiple lines by inserting carriage returns at the relevant positions (place your cursor in the formula bar and hit ALT+ENTER).
5. If the formula is complex, does it contain large range references (arrays of 20+ items become difficult to track with this method)? If not, invoke Evaluate Formula. This utility is in the Formulas tab in Excel 2007 and 2010.
   a. Focus on the underlined argument and click Evaluate.
   b. Excel will determine the order in which to resolve the formula and will resolve each statement one step at a time, overwriting the statement with constant values (or inline array constants for arrays).
   c. Click Evaluate at the end of each interval, once you have satisfied yourself with the result of the current step.
   d. Click Step In on any argument that you are unsure about.


6. For relatively simple formulae you can manually emulate step 5 as follows:
   a. Highlight the formula component that you wish to evaluate.
   b. Hit F9. The component will be resolved.
   c. Repeat these steps until you fully resolve the formula.
7. If the formula is complex and does reference large arrays, consider making a copy and manually reducing the size of the arrays. Once done, go through the steps described in item #5.

If you are trying to track an error:
1. Use the Evaluate Formula utility described above.
2. Step 1 may reveal that the error is a knock-on effect of an error in a precedent. Highlight the precedent ranges and seek out the errors. It is always best to handle errors at the source.
   a. Hit F5 > Special and choose Precedents > Direct only (note this will only select ranges on the current worksheet).
   b. Invoke Go To Special again, and this time seek out errors by choosing Formulas > Errors. Note that errors can be constant values too, so be sure to test Constants > Errors.
   c. Step (b) will select any ranges that contain error values. If these are numerous, consider highlighting them in a different fill colour, so that they are easy to revisit and correct.

If after the audit process you decide that the formulae were rather difficult to get to grips with, don't assume that you will struggle any less the next time. Take the time now to replace any repetitive references with names. Replace any constants with names. Consider breaking the formula down using helper columns, rows and cells. And finally, consider logging a written explanation of the formula in an audit or log sheet.


8. Funky formulae
Here is a collection of some cool formulae. Even if you don't think it likely that you will ever need one of these formulae, study the solutions, as they reveal many tricks of the trade.

Get the Month Number of a Financial Year
The MONTH function yields the month number based on the calendar year. Here's how to adjust it to conform to a financial year.

Formula:
=MOD(MONTH(Date)-Start_Month_Num,12)+1
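The wrap-around arithmetic in the formula above can be checked with a few lines of Python. The function name financial_month is an assumption for illustration; Python's % operator, like Excel's MOD, returns a non-negative result for a positive divisor, so the months before the start month wrap to the end of the financial year.

```python
def financial_month(calendar_month, start_month_num):
    """Month number within a financial year that starts in start_month_num."""
    return (calendar_month - start_month_num) % 12 + 1

# Financial year starting in October (Start_Month_Num = 10).
for month in (10, 12, 1, 9):
    print(month, financial_month(month, 10))
```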

Assumptions: Date can be a reference to a range, name or date constant. If your financial year begins in, for example, October, Start_Month_Num will be 10.

Get the Week Number of a Financial Year
The WEEKNUM function yields a week number based on the calendar year. Here's how to adjust it to begin from a set date.

Formula:
=INT((Date-DATE(YEAR(Date+Days_Remaining),-Months_Remaining,-1))/7)

Assumptions: Date can be a reference to a range, name or date constant. Days_Remaining refers to the number of days from the start date until the end of the year. For instance, if adjusted to work with the UK tax year (beginning 06 April), this would be 270, the number of days between 06 April and 01 January the following year. Months_Remaining refers to the number of months remaining from the start date until the end of the year. For instance, if adjusted to work with the UK tax year this would be 8, the number of full months remaining between 06 April and 01 January the following year.

Repeat Each Item in a Table n Times
Assume you have a range of values, or an array. How do you repeat each item in that range n times?

Formula:
=INDEX(Table,INT((ROWS($1:1)-1)/n)+1,1)

Assumptions: Table can be a reference to a range, name or array constant. The table is in a row-based vector (or array). And if the table is an array or range, the repeating value is sourced from the 1st column.


Repeat a Table n Times
Similar to the previous listing: assume you have a range of values, or an array. How do you repeat the whole range of values n times?

Formula (copied down):
=IF(ROWS($1:1)<=n*COUNTA(Table),INDEX(Table,MOD(ROWS($1:1)-COUNTA(Table)-1,COUNTA(Table))+1,1),"")
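The index arithmetic behind the two repeating formulae (integer division to repeat each item, a remainder to repeat the whole table) can be sketched in Python. The function names are assumptions for illustration, and k stands in for ROWS($1:1), the 1-based number of the row the formula has been copied to.

```python
def repeat_each(table, n):
    """Repeat every item n times in turn: INT((k-1)/n)+1 picks the item."""
    return [table[(k - 1) // n] for k in range(1, n * len(table) + 1)]

def repeat_table(table, n):
    """Repeat the whole table n times: MOD(k-1, count)+1 picks the item."""
    count = len(table)
    return [table[(k - 1) % count] for k in range(1, n * count + 1)]

items = ["a", "b", "c"]
print(repeat_each(items, 2))
print(repeat_table(items, 2))
```

The Python lists are indexed from 0, so the trailing +1 of the worksheet formulae is dropped. The worksheet version writes MOD(k-COUNTA(Table)-1, ...) rather than MOD(k-1, ...); subtracting a whole multiple of the divisor does not change the remainder, so both pick the same item.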

Assumptions: Table can be a reference to a range, name or array constant. The table is a row-based vector, array or range. The table does not contain any blanks within its first column. And if the table is an array or range, the repeating value is sourced from the 1st column.

Get the nth Element from a String based on a given Delimiter
Assume you have a text string that is a concatenation of sub-strings, separated by a delimiter. How do you retrieve the nth element?

Formula (copied down):
=TRIM(MID(SUBSTITUTE(String,Delim,REPT(" ",255)),255*(n-1)+1,255))
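The padding trick can be followed step by step in a Python sketch. It assumes MID's start position is taken as 255*(n-1)+1, so that n = 1 starts at the first character (a start position of 0 is not valid in MID). The function name nth_element is hypothetical.

```python
def nth_element(string, delim, n, width=255):
    """Return the nth delimited element using the pad-and-slice technique."""
    # SUBSTITUTE: replace each delimiter with a wide block of spaces.
    padded = string.replace(delim, " " * width)
    # MID: take one block-wide window; start is 1-based as in Excel.
    start = width * (n - 1) + 1
    chunk = padded[start - 1:start - 1 + width]
    # TRIM: strip the padding (and collapse any repeated inner spaces).
    return " ".join(chunk.split())

text = "alpha,beta,gamma"
print([nth_element(text, ",", n) for n in (1, 2, 3)])
```

Each window lands somewhere in the padding before the element it targets, so after trimming only that element is left. This holds only while the elements are short relative to the 255-character block.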

Assumptions: String and Delim can be references to a range, name or constant. Delim should be a single-character delimiter. The sub-string element is no longer than 255 characters.

3-Dimensional SUMIF
Did you know you can emulate a multi-dimensional table by utilising separate worksheets as the 3rd dimension? Be aware that this formula is volatile.

Formula:
=SUM(SUMIF(INDIRECT("Sheet" & {1,2,3} & "!A:A"),"=" & criterion,INDIRECT("Sheet" & {1,2,3} & "!B:B")))

Assumptions: The references are all stored in 3 worksheets called Sheet1, Sheet2 and Sheet3. The criteria_range is in column A and the sum_range in column B in each sheet. Criterion can be a reference to a range, name or constant.

Multi-Criteria Lookups
Functions such as VLOOKUP, HLOOKUP, MATCH and LOOKUP only support a single criterion. What if you have more than one criterion?

Formula:
{=VLOOKUP(Criteria1 & "|" & Criteria2,CHOOSE({1,2},Criteria1_Range & "|" & Criteria2_Range,Result_Range),2,0)}

Assumptions: Criteria1 and Criteria2 can be references to ranges, names or constants. Criteria1_Range, Criteria2_Range and Result_Range must be range references. Criteria1_Range, Criteria2_Range and Result_Range have equal size dimensions.
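The composite-key idea behind the multi-criteria lookup can be sketched in Python. The function name and the sample data in the usage line are hypothetical; the sketch joins both criteria with the same "|" separator and returns the first exact match, as VLOOKUP does with a final argument of 0.

```python
def multi_criteria_lookup(c1, c2, c1_range, c2_range, result_range, sep="|"):
    """Return the result paired with the first row matching both criteria."""
    key = f"{c1}{sep}{c2}"  # Criteria1 & "|" & Criteria2
    for a, b, result in zip(c1_range, c2_range, result_range):
        # Criteria1_Range & "|" & Criteria2_Range, evaluated row by row.
        if f"{a}{sep}{b}" == key:
            return result
    return None  # the worksheet formula would return #N/A here


regions = ["North", "North", "South"]
products = ["Widget", "Gadget", "Widget"]
sales = [100, 200, 300]
print(multi_criteria_lookup("North", "Gadget", regions, products, sales))
```

The separator guards against false matches such as "AB" & "C" versus "A" & "BC"; it should be a character that does not occur in the criteria themselves.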


Vlookup returning Multiple Results
What happens when a lookup value appears more than once in a lookup table? Excel will only yield a single result. Is it possible to yield each result? Formula (copied down):
{=INDEX(Result_Range,MATCH(1,(Criteria_Range=Criteria)*ISERROR(MATCH(Result_Range,Results,0)),0))}

Assumptions: Results is a relative reference to the range of results already returned, offset by -1 row. In other words, if the formula is entered in A2, Results should be a reference to $A$1:$A1.

Variable Discounting using Differential Rates
Often commissions, tax and discounts have rates that depend on the transaction value. Let's assume a given order attracts a discount, and that the discount is applied at different percentages over different order value intervals. Formula:
=SUMPRODUCT(--(Amount>Threshold), (Amount-Threshold), Differential_rate)
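The arithmetic behind the SUMPRODUCT can be followed step by step in a small Python sketch. The function name is hypothetical; the thresholds and rates in the usage line are the example values quoted in the assumptions for this listing.

```python
def differential_discount(amount, thresholds, differential_rates):
    """Sum (amount - threshold) * rate over every threshold exceeded."""
    return sum(
        (amount - threshold) * rate
        for threshold, rate in zip(thresholds, differential_rates)
        if amount > threshold  # --(Amount>Threshold) switches a tier on
    )


thresholds = [6000, 20000, 50000, 60000]
rates = [0.17, 0.13, 0.12, 0.05]
print(round(differential_discount(25000, thresholds, rates), 2))
```

For an order of 25,000 only the first two tiers apply: 19,000 at 0.17 plus 5,000 at 0.13. Each rate is differential, meaning it is added on top of the rates of the lower tiers rather than replacing them.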

Assumptions: Amount is the order total, and can be a reference to a range, name or constant. Threshold is a 1-dimensional array, range or named array of order value brackets (e.g. {6000;20000;50000;60000}). Differential_rate is a 1-dimensional array, range or named array of differential discount rates (e.g. {0.17;0.13;0.12;0.05}).

Extract Numbers from an Alpha-numeric String
You have a string of alpha-numeric values. How can you extract all the numbers into a single numeric result? Formula:
=-SUMPRODUCT(-MID(0&String, LARGE(INDEX(ISNUMBER(-MID(String, ROW(INDIRECT("1:"&LEN(String))),1))* ROW(INDIRECT("1:"&LEN(String))),0),ROW(INDIRECT("1:"&LEN(String))))+1,1), 10^ROW(INDIRECT("1:"&LEN(String)))/10)
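The formula is dense, so a step-by-step Python emulation may help when reading it. This is a sketch of one reading of the formula, with a hypothetical function name; each comment names the worksheet expression the line is meant to mirror.

```python
def extract_number(s):
    """Concatenate the digits found in `s` into a single integer."""
    n = len(s)
    positions = range(1, n + 1)  # ROW(INDIRECT("1:"&LEN(String)))
    # ISNUMBER(-MID(String,i,1)) * i: the position of each digit, else 0.
    flagged = [i if s[i - 1] in "0123456789" else 0 for i in positions]
    # LARGE(..., k) for k = 1..n: digit positions from right to left,
    # followed by zeros for the non-digit characters.
    ranked = sorted(flagged, reverse=True)
    padded = "0" + s  # 0&String, so that position 0 + 1 yields "0"
    total = 0
    for k, pos in zip(positions, ranked):
        digit = int(padded[pos])  # MID(0&String, pos + 1, 1), 1-based
        total += digit * 10 ** (k - 1)  # weight 10^k/10
    return total


print(extract_number("ab12c3"))
```

The rightmost digit receives weight 1, the next weight 10, and so on, which is why the digit positions are ranked from largest to smallest.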

Assumptions: String can be a reference to a range, name or constant.

Extract a Date from a Text String
Extracting dates from within strings is made easy by virtue of the fact that dates are stored as numerical values. Formula:
=-LOOKUP(9.9999999999999E+307,-MID(String,ROW(INDIRECT("1:"&LEN(String))),10))

Assumptions: String can be a reference to a range, name or constant. The date must conform to a valid date format, even though it is embedded within a string. This rendition expects a 10-character date (e.g. dd/mm/yyyy); change the length value of 10 in the formula to suit other date formats. The result is a date serial number, so format the cell as a date.


Calculate the Last Used Row in a Column (useful for Dynamic Ranges)
In the Dynamic Ranges chapter, methods were described to calculate the last used row in a range for a specified data type (i.e. numbers or text). The following formula returns the last used row whether the last entry is a number or a text value. Formula:
{=MAX(IF(ISNUMBER(CHOOSE({1;2},MATCH(9.9999999999999E+307,Column,1),MATCH(REPT("z",255), Column,1))), CHOOSE({1;2},MATCH(9.9999999999999E+307, Column,1),MATCH(REPT("z",255), Column,1))))}
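The intent of the formula, taking the later of the last numeric entry and the last text entry, can be sketched in Python. The function name is hypothetical, a list stands in for Column, and `None` is assumed to represent an empty cell.

```python
def last_used_position(column):
    """Return the 1-based position of the last number or text entry."""
    last_number = 0  # stands in for MATCH(9.9999999999999E+307, Column, 1)
    last_text = 0    # stands in for MATCH(REPT("z", 255), Column, 1)
    for pos, value in enumerate(column, start=1):
        if isinstance(value, bool):
            continue  # logical values are neither numbers nor text here
        if isinstance(value, (int, float)):
            last_number = pos
        elif isinstance(value, str):
            last_text = pos
    # MAX over whichever of the two matches succeeded; 0 if neither did.
    return max(last_number, last_text)


print(last_used_position([1, "a", None, 3.5, None]))
```

The position returned equals the worksheet row number only when Column starts in row 1, as it does for a whole-column reference.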

Assumptions: Column should be a reference to a range, typically a whole column.

Locate a Break-Even Point
One way of identifying a break-even point is by identifying the last time a spread was equal to or larger than zero. Formula:
=LOOKUP(1,1/(Values<0),INDEX(Dates,0,2))

Assumptions: Values and Dates are 1-dimensional column-based ranges or arrays. Each Value has a corresponding Date.


9. Shortcuts
Control Keys
COMBINATION | ACTION
CTRL+PgUp | Switches to the previous worksheet tab (moving right-to-left).
CTRL+PgDn | Switches to the next worksheet tab (moving left-to-right).
CTRL+SHIFT+( | Unhides any hidden rows within the selection.
CTRL+SHIFT+& | Applies the outline border to the selected cells.
CTRL+SHIFT+_ | Removes the outline border from the selected cells.
CTRL+SHIFT+~ | Applies the General number format.
CTRL+SHIFT+$ | Applies the Currency format with two decimal places (negative numbers in parentheses).
CTRL+SHIFT+% | Applies the Percentage format with no decimal places.
CTRL+SHIFT+^ | Applies the Scientific number format with two decimal places.
CTRL+SHIFT+# | Applies the Date format with the day, month, and year.
CTRL+SHIFT+@ | Applies the Time format with the hour and minute, and AM or PM.
CTRL+SHIFT+! | Applies the Number format with two decimal places, thousands separator, and minus sign (-) for negative values.
CTRL+SHIFT+: | Enters the current time.
CTRL+SHIFT+" | Copies the value from the cell above the active cell into the cell or the Formula Bar.
CTRL+SHIFT+Plus (+) | Displays the Insert dialog box to insert blank cells.
CTRL+Minus (-) | Displays the Delete dialog box to delete the selected cells.
CTRL+; | Enters the current date.
CTRL+` | Alternates between displaying cell values and displaying formulae in the worksheet.
CTRL+' | Copies a formula from the cell above the active cell into the cell or the Formula Bar.
CTRL+1 | Displays the Format Cells dialog box.
CTRL+2 | Applies or removes bold formatting.
CTRL+3 | Applies or removes italic formatting.
CTRL+4 | Applies or removes underlining.
CTRL+5 | Applies or removes strikethrough.
CTRL+6 | Alternates between hiding and displaying objects.
CTRL+8 | Displays or hides the outline symbols.
CTRL+9 | Hides the selected rows.
CTRL+0 | Hides the selected columns.
CTRL+A | Selects the entire worksheet. If the worksheet contains data, CTRL+A selects the current region. Pressing CTRL+A a second time selects the entire worksheet. When the insertion point is to the right of a function name in a formula, displays the Function Arguments dialog box. CTRL+SHIFT+A inserts the argument names and parentheses when the insertion point is to the right of a function name in a formula.
CTRL+B | Applies or removes bold formatting.
CTRL+C | Copies the selected cells.
CTRL+D | Uses the Fill Down command to copy the contents and format of the topmost cell of a selected range into the cells below.
CTRL+F | Displays the Find and Replace dialog box, with the Find tab selected. SHIFT+F5 also displays this tab, while SHIFT+F4 repeats the last Find action. CTRL+SHIFT+F opens the Format Cells dialog box with the Font tab selected.
CTRL+G | Displays the Go To dialog box. F5 also displays this dialog box.
CTRL+H | Displays the Find and Replace dialog box, with the Replace tab selected.
CTRL+I | Applies or removes italic formatting.
CTRL+K | Displays the Insert Hyperlink dialog box for new hyperlinks or the Edit Hyperlink dialog box for selected existing hyperlinks.
CTRL+L | Displays the Create Table dialog box.
CTRL+N | Creates a new, blank workbook.
CTRL+O | Displays the Open dialog box to open or find a file. CTRL+SHIFT+O selects all cells that contain comments.
CTRL+P | Displays the Print tab in Microsoft Office Backstage view. CTRL+SHIFT+P opens the Format Cells dialog box with the Font tab selected.
CTRL+R | Uses the Fill Right command to copy the contents and format of the leftmost cell of a selected range into the cells to the right.
CTRL+S | Saves the active file with its current file name, location, and file format.
CTRL+T | Displays the Create Table dialog box.
CTRL+U | Applies or removes underlining. CTRL+SHIFT+U switches between expanding and collapsing of the formula bar.
CTRL+V | Inserts the contents of the Clipboard at the insertion point and replaces any selection. Available only after you have cut or copied an object, text, or cell contents. CTRL+ALT+V displays the Paste Special dialog box. Available only after you have cut or copied an object, text, or cell contents on a worksheet or in another program.
CTRL+W | Closes the selected workbook window.
CTRL+X | Cuts the selected cells.
CTRL+Y | Repeats the last command or action, if possible.
CTRL+Z | Uses the Undo command to reverse the last command or to delete the last entry that you typed.


Function Keys
KEY | ACTION
F1 | Displays the Excel Help task pane. CTRL+F1 displays or hides the ribbon. ALT+F1 creates an embedded chart of the data in the current range. ALT+SHIFT+F1 inserts a new worksheet.
F2 | Edits the active cell and positions the insertion point at the end of the cell contents. It also moves the insertion point into the Formula Bar when editing in a cell is turned off. SHIFT+F2 adds or edits a cell comment. CTRL+F2 displays the print preview area on the Print tab in the Backstage view.
F3 | Displays the Paste Name dialog box. Available only if there are existing names in the workbook. SHIFT+F3 displays the Insert Function dialog box.
F4 | Repeats the last command or action, if possible. When a cell reference or range is selected in a formula, F4 cycles through all the various combinations of absolute and relative references. CTRL+F4 closes the selected workbook window. ALT+F4 closes Excel.
F5 | Displays the Go To dialog box. CTRL+F5 restores the window size of the selected workbook window.
F6 | Switches between the worksheet, ribbon, task pane, and Zoom controls. In a worksheet that has been split (View menu, Manage This Window, Freeze Panes, Split Window command), F6 includes the split panes when switching between panes and the ribbon area. SHIFT+F6 switches between the worksheet, Zoom controls, task pane, and ribbon. CTRL+F6 switches to the next workbook window when more than one workbook window is open.
F7 | Displays the Spelling dialog box to check spelling in the active worksheet or selected range. CTRL+F7 performs the Move command on the workbook window when it is not maximized. Use the arrow keys to move the window, and when finished press ENTER, or ESC to cancel.
F8 | Turns extend mode on or off. In extend mode, Extended Selection appears in the status line, and the arrow keys extend the selection. SHIFT+F8 enables you to add a nonadjacent cell or range to a selection of cells by using the arrow keys. CTRL+F8 performs the Size command (on the Control menu for the workbook window) when a workbook is not maximized. ALT+F8 displays the Macro dialog box to create, run, edit, or delete a macro.
F9 | Calculates all worksheets in all open workbooks. SHIFT+F9 calculates the active worksheet. CTRL+ALT+F9 calculates all worksheets in all open workbooks, regardless of whether they have changed since the last calculation. CTRL+ALT+SHIFT+F9 rechecks dependent formulae, and then calculates all cells in all open workbooks, including cells not marked as needing to be calculated. CTRL+F9 minimizes a workbook window to an icon.
F10 | Turns key tips on or off. (Pressing ALT does the same thing.) SHIFT+F10 displays the shortcut menu for a selected item. ALT+SHIFT+F10 displays the menu or message for an Error Checking button. CTRL+F10 maximizes or restores the selected workbook window.
F11 | Creates a chart of the data in the current range in a separate Chart sheet. SHIFT+F11 inserts a new worksheet. ALT+F11 opens the Microsoft Visual Basic For Applications Editor, in which you can create a macro by using Visual Basic for Applications (VBA).
F12 | Displays the Save As dialog box.


10.

Limitations Table
FEATURE | LIMIT (EXCEL 2010) | LIMIT (EXCEL 2007) | LIMIT (EXCEL 2003)
Open workbooks | Limited by available memory and system resources | Limited by available memory and system resources | Limited by available memory and system resources
Worksheet size | 1,048,576 rows by 16,384 columns | 1,048,576 rows by 16,384 columns | 65,536 rows by 256 columns
Column width | 255 characters | 255 characters | 255 characters
Row height | 409 points | 409 points | 409 points
Page breaks | 1,026 horizontal and vertical | 1,026 horizontal and vertical | 1,000 horizontal and vertical
Sheets in a workbook | Limited by available memory (default is 3 sheets) | Limited by available memory (default is 3 sheets) | Limited by available memory (default is 3 sheets)
Colors in a workbook | 16 million colors (32 bit with full access to 24 bit color spectrum) | 16 million colors (32 bit with full access to 24 bit color spectrum) | 56
Named views in a workbook | Limited by available memory | Limited by available memory | Limited by available memory
Unique cell formats/cell styles | 64,000 | 64,000 | 4,000
Number formats in a workbook | Between 200 and 250, depending on the language version of Excel that you have installed | Between 200 and 250, depending on the language version of Excel that you have installed | Between 200 and 250, depending on the language version of Excel that you have installed
Names in a workbook | Limited by available memory | Limited by available memory | Limited by available memory
Windows in a workbook | Limited by available memory | Limited by available memory | Limited by system resources
Panes in a window | 4 | 4 | 4
Linked sheets | Limited by available memory | Limited by available memory | Limited by available memory
Scenarios | Limited by available memory; a summary report shows only the first 251 scenarios | Limited by available memory; a summary report shows only the first 251 scenarios | Limited by available memory; a summary report shows only the first 251 scenarios
Changing cells in a scenario | 32 | 32 | 32
Adjustable cells in Solver | 200 | 200 | 200
Custom functions | Limited by available memory | Limited by available memory | Limited by available memory
Zoom range | 10 percent to 400 percent | 10 percent to 400 percent | 10 percent to 400 percent
Reports | Limited by available memory | Limited by available memory | Limited by available memory
Sort references | 64 in a single sort; unlimited when using sequential sorts | 64 in a single sort; unlimited when using sequential sorts | 3 in a single sort; unlimited when using sequential sorts
Undo levels | 100 | 100 | 16
Fields in a data form | 32 | 32 | 32
Items displayed in filter drop-down lists | 10,000 | 10,000 | 1,000
Noncontiguous cells that can be selected | 2,147,483,648 cells | 2,147,483,648 cells | 8,192 cells
Number precision | 15 digits | 15 digits | 15 digits
Smallest allowed negative number | -2.23E-308 | -2.2251E-308 | -2.2251E-308
Smallest allowed positive number | 2.23E-308 | 2.229E-308 | 2.229E-308
Largest allowed positive number | 1.00E+308 | 1.79769313486231E+308 | 1.79769313486231E+308
Largest allowed negative number | -1.00E+308 | -1E-307 | -1E-307
Largest allowed positive number via formula | 1.7976931348623158e+308 | 1.7976931348623158e+308 | 1.7976931348623158e+308
Largest allowed negative number via formula | -1.7976931348623158e+308 | -1.7976931348623158e+308 | -1.7976931348623158e+308
Length of formula contents | 8,192 characters | 8,192 characters | 1,024 characters
Internal length of formula | 16,384 bytes | 16,384 bytes | 16,384 bytes
Iterations | 32,767 | 32,767 | 32,767
Worksheet arrays | Limited by available memory | Limited by available memory | Limited by available memory
Selected ranges | 2,048 | 2,048 | 2,048
Arguments in a function | 255 | 255 | 30
Nested levels of functions | 64 | 64 | 7
Earliest date allowed for calculation | January 1, 1900 (January 1, 1904, if 1904 date system is used) | January 1, 1900 (January 1, 1904, if 1904 date system is used) | January 1, 1900 (January 1, 1904, if 1904 date system is used)
Latest date allowed for calculation | December 31, 9999 | December 31, 9999 | December 31, 9999
Largest amount of time that can be entered | 9999:59:59 | 9999:59:59 | 9999:59:59
