Exact Match Problems

Matching two separate data files that contain full name and full postal information, with the intent of finding matching records so that data such as email addresses can be appended, is a very complex problem. Addressing it correctly requires a combination of high-powered software, high ethical standards, and respect for and understanding of industry best practices.

Looking at just the software portion of the equation, it should be understood that the software required to carry out this endeavor is expensive, complex and multi-faceted, and is made up of several components. These include software for databases, file analysis, file manipulation and file merge-purge, to name just a few. Most clients we talk with believe that all that is required to correctly compare and append information such as email to their house file is some merge-purge software. Indeed, some companies that purport to do Email Appending use only the elementary merge-purge capability associated with SQL or Oracle software. These operational approaches to Email Appending do exist, and they hurt the industry. It is like putting a lawnmower engine on a steel beam and calling it a racecar. Without investment in a body, wheels, transmission and steering, the engine will not provide a quality outcome and will not win any races. Furthermore, if enough people enter lawnmower engines on steel beams into car races, it will damage the credibility of the auto racing industry.

That being said, let's look at the basic engines for merge-purge. Some folks use the elementary database engines described above. Some folks try to build their own engines. Other companies use “off-the-shelf” engines developed by software companies like Group1, First Logic and others. Still other companies use a hybrid of “off-the-shelf” and internally developed approaches. AcquireWeb falls into this latter category. Most quality engines allow for a variety of match stringency settings. These settings (exact, tight, medium and loose) define the goodness-of-fit criteria that two separate pieces of data must meet to be called a match. The highly ethical companies use a tight match stringency in their match algorithm. The majority of companies that do email appending use a medium stringency algorithm for their projects to get more matches. What surprises most clients is that the highly ethical companies don’t use an exact match algorithm for email appending. The fact is that comparing two separate data files containing full name and full postal data is extremely complex, and rarely do both the name and the address match exactly.

A simple example is to look at first name alone: Robert, _Robert, robert, _robert, ROBERT, _ROBERT, Bob, _Bob, bob, _bob, BOB, _BOB, Bobby, _Bobby, bobby, _bobby, Rob, _Rob, rob, _rob, ROB, _ROB, Robby, _Robby, robby, _robby, ROBBY, _ROBBY are 28 examples of one name. They cover logical extensions of the name, differences in letter case, and one common data entry variable: starting the name at the beginning of a data field or starting it after a leading space in the field (shown here as an underscore). Many other data entry variables exist, as you can imagine.
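The variants above can be collapsed to a single form before any comparison takes place. The following is only a minimal sketch of that idea, not a description of any vendor's engine. The NICKNAMES table and the normalize_first_name function are hypothetical names introduced for illustration, and a real system would need a far larger table and many more data entry rules.

```python
# Hypothetical nickname table for illustration only; a production
# system would carry thousands of entries, not four.
NICKNAMES = {
    "bob": "robert",
    "bobby": "robert",
    "rob": "robert",
    "robby": "robert",
}


def normalize_first_name(raw: str) -> str:
    """Strip field padding, fold case, and map known nicknames to a formal name."""
    cleaned = raw.strip().lower()
    # Unknown names pass through unchanged after cleaning.
    return NICKNAMES.get(cleaned, cleaned)


# A few of the variants listed above, with a leading space where the
# text shows an underscore.
variants = [" Robert", "ROBERT", "bob", " Bobby", "Rob", " ROBBY"]
canonical = {normalize_first_name(v) for v in variants}
print(canonical)  # {'robert'}
```

All six sample variants reduce to the same value, which is what allows a later comparison step to treat them as one name.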

It must be understood that most people with common first names tend to use more than one extension of their name depending on the circumstance. For example, they may use their full formal name “Robert” when registering to vote or filing their taxes. They may use a less formal extension of their name, “Bob”, when signing up to join a sports league. Hypothetically, if you are looking for Robert Smith by comparing a voter registration file with a sports league file, and you use exact match stringency in your merge-purge algorithm, the result will be NO MATCH. This outcome occurs even if all the other data elements, including last name, address 1, address 2, city, state, zip, zip+4 and phone, are exact matches down to the data entry variables. That is because Exact means Exact. Thus comparing a 1 million record customer file against a 95 million record opt-in email file using “Exact Match Stringency” could return 1,000 +/- 1,000 matches. This is a sub-optimal outcome considering that, on average, 150,000 good and true matches may exist in the file. At the other extreme, using medium or loose match stringency could return a match for Rebecca Smith, Ronald Smith, Reginald Smith or Randy Smith, none of whom may live in the same town or zip code as the Robert Smith you are looking for.
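The Robert Smith example can be sketched in a few lines of code. This is an illustration under stated assumptions, not the algorithm of any commercial merge-purge engine: the Record fields, the NICKNAMES table, the sample addresses, and the specific rules chosen for tight_match and loose_match are all hypothetical. In particular, real products define their stringency levels in their own ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    first: str
    last: str
    address: str
    zip_code: str


# Hypothetical nickname table, for illustration only.
NICKNAMES = {"bob": "robert", "bobby": "robert", "rob": "robert", "robby": "robert"}


def _clean(value: str) -> str:
    """Remove field padding and fold case."""
    return value.strip().lower()


def _formal(first: str) -> str:
    """Map a cleaned first name to its formal form when one is known."""
    cleaned = _clean(first)
    return NICKNAMES.get(cleaned, cleaned)


def exact_match(a: Record, b: Record) -> bool:
    # Exact means exact: every field must be identical, character for character.
    return a == b


def tight_match(a: Record, b: Record) -> bool:
    # Assumed rule: allow nicknames, case and padding differences,
    # but still require every field to agree.
    return (
        _formal(a.first) == _formal(b.first)
        and _clean(a.last) == _clean(b.last)
        and _clean(a.address) == _clean(b.address)
        and _clean(a.zip_code) == _clean(b.zip_code)
    )


def loose_match(a: Record, b: Record) -> bool:
    # Assumed rule: first initial plus last name only, ignoring address.
    return (
        _clean(a.first)[:1] == _clean(b.first)[:1]
        and _clean(a.last) == _clean(b.last)
    )


voter = Record("Robert", "Smith", "12 Oak St", "94404")
league = Record("Bob", "Smith", "12 Oak St", "94404")
stranger = Record("Rebecca", "Smith", "9 Elm Ave", "10001")

print(exact_match(voter, league))    # False
print(tight_match(voter, league))    # True
print(loose_match(voter, stranger))  # True
print(tight_match(voter, stranger))  # False
```

Under these assumed rules, the exact comparison misses the true match between the voter and sports league records, the loose comparison pairs Robert Smith with an unrelated Rebecca Smith, and the tight comparison avoids both mistakes.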

There is no single match stringency that will return all 150,000 good and true matches from the example above, and only those 150,000 matches. The trick is to return as many of the 150,000 good matches as possible while limiting the number of inaccurate matches. The acceptable rate of inaccurate matches should be less than 0.5%. The most conservative approach is to carry out matching using tight match logic. In the better systems this allows for logical extensions in first names, a variety of data entry variables and slight data entry errors. Of course, putting a finely tuned 600 horsepower engine on a steel beam and entering it into the Indy 500 won’t win any races. You still need a great body, wheels, transmission… Likewise, having a great merge-purge engine will not result in optimal outcomes for the client unless there is great supporting software in addition to high ethical standards and an understanding of industry best practices. AcquireWeb brings all the components together in one package to deliver the most high-quality matches in the industry. For a test drive visit www.acquireweb.com or call us at 650-212-2233.

Albert Gadbut
President
AcquireWeb, Inc.
2003

