When you join table columns, you no longer require and exact match. Fuzzy matching lets you compare items in separate lists and join them if they're close to each other. You can even set the matching tolerance, or Similarity Threshold.
A common use case for fuzzy matching is with freeform text fields, such as in a survey where the question of your favorite fruit might have typos, singulars, plurals, uppercase, lowercase and other variations that are not an exact match.
Fuzzy matching is only supported on merge operations over text columns. Power Query uses the Jaccard similarity algorithm to measure the similarity between pairs of instances.
Procedure
-
To open a query, locate one previously loaded from the Power Query Editor, select a cell in the data, and then select Query > Edit. For more information see Create, edit, and load a query in Excel (Power Query).
-
Select Home > Combine > Merge Queries. You can also select Merge Queries as New. The Merge dialog box appears with the primary table at the top.
-
Select the column you want to use for your fuzzy match. In this example, we select First Name.
-
From the drop-down list, select the secondary table, and then select the corresponding fuzzy match column. In this example, we select First Name.
-
Select a Join Kind. There are several different ways to join. Left Outer is the default and the most common. For more information on each kind of join, see Merge queries.
-
Select Use fuzzy matching to perform the merge, select Fuzzy matching options, and then select from the following options:
-
Similarity Threshold Indicates how similar two values need to be in order to match. The minimum value of 0.00 causes all values to match each other. The maximum value of 1.00 only allows exact matches. The default value is 0.80.
-
Ignore case Indicates whether text values should be compared in a case sensitive or insensitive manner. The default behavior is case insensitive, which means case is ignored.
-
Maximum number of matches Controls the maximum number of matching rows that will be returned for each input row. For example, if you only want to find one matching row for each input row, specify a value of 1. The default behavior is to return all matches.
-
Transformation table Specify another query that holds a mapping table, so that some values can be auto-mapped as part of the matching logic. For example, defining a two-column table with a “From” and “To” text columns with values “Microsoft” and “MSFT” will make these two values be considered the same (similarity score of 1.00).
-
-
Power Query analyzes both tables, and displays a message about how many matches it made. In the example, the selection matches 3 of 4 rows from the first table. Without using fuzzy matching, only 2 of 4 rows would match.
-
If you're satisfied, select OK. If not, try different Fuzzy merge options to customize your experience.
-
When satisfied, Select OK.