Relation for Excel
by Matthias Bürcher @ 2016-2022 Version 3.0
This macro provides functions to create simple relational algebra.
Unlike other Excel solutions, this one is purely functional and does not use macros or external DLL.
The functions all return relations. Relations are stored as text in one cell with double colon (::) as field separator and space+newline as row separator. The first line are the column names.
Relation for Excel 3.0 adds SQL syntax. You can now make SQL queries in Excel using tables in any range in a worksheet, including joins between tables.
The zip file contains:
- relation-for-excel-300.xlsm: An Excel spreadsheet with examples
Version 3.0 is only distributed as exampel worksheet (xlsm), not as add-in (xlam). To use it, remove the example tabs or copy the code, which is in one single self contained module in VBA, into your spreadsheet. You can create an add-in from it, if you like.
This presentation will give you a good introduction to relational algebra and how to use it with these functions
Version 3.0 adds the following functions:
- relSql(code, range1, range2 ...): the function you will use
- relParseSql(code): a helper function that displays the relation instructions created from an SQL query
- relTokenizeSql(code): a helper function that displays the tokens of an SQL query
Some valid SQL queries are:
- SELECT * FROM t1
- SELECT CustomerName, City FROM t1
- SELECT * FROM t1 WHERE City='Berlin' OR City='München'
- SELECT country, POW(2,3) AS c FROM t1
- SELECT Country, COUNT(Country) AS n FROM t1 WHERE Country LIKE '%land%' HAVING n < 2 ORDER BY Country
- SELECT t1.OrderID, t2.CustomerName, t1.OrderDate FROM t1 NATURAL JOIN t2
- SELECT t1.CustomerName AS p, t2.OrderID AS ok FROM t1 JOIN t2 ON t1.CustomerID = t2.CustomerID ORDER BY p
- Valid statements are SELECT AS FROM NATURAL LEFT RIGHT OUTER JOIN ON WHERE HAVING ORDER BY and the must be written in uppercase
- Ranges are designated t1 through t9 and must be provided as arguments in the function.
- Expressions are allowed in columns, but they must have a qualified name using AS.
- Functions and aggregators can be mixed, but there can be only one aggregator.
- Texts use single quotes
- Columns are not typed: operator context defines type. When comparing, text is bigger than number that is bigger than empty.
- As with relational algebra working on sets, results are always distinct and grouping is auto, i.e. there is no statement GROUP BY
Operators: + - * / > < >= <= = <> LIKE IN
Aggregators: AVG COUNT MAX MEDIAN MIN STDEV SUM
Numerical functions: ABS COS EXP INT LN LOG MOD POW ROUND SGN SIN SQRT TAN
Text functions: LEFT LEN LOWER MID REPLACE RIGHT TRIM UPPER
This was a summer project. For the programming challenges see Parsing SQL for Excel
For the documentation of the other functions see Relation for Excel 2.3
Old presentation relation-for-excel-200-presentation.pdf
21.8.2022 Relation for Excel 3.0
- SQL function
- Handling of multiline cells and long intermediate results (hash)
20.8.2020 Relation for Excel 2.3
- Hash join improves dramatically performance of natural join.
- Hash for duplicate elimination improves performance.
4.4.2017 Relation for Excel 2.0
- Complete rewrite on named relations.
15.8.2016 Relation for Excel 1.1.2
- relCell and relRow do not return error any more, but empty strings, when out of range
15.6.2016 Relation for Excel 1.1.1
- relProject gives no results for sum and count even if there is no row
- relParseExpression: In expressions with more than 10 columns, you do not need to preceed any more the 1-9 columns with a 0 (substution starts top, so there is no ambiguity)
10.5.2016 Relation for Excel 1.1
- Fixed several small bugs
- Redefined swFilter as unlimited parameters
20.4. Relation for Excel 1.0