Accu360 Blog

What Are The Best Practices For Data Silo Integration?

Oct 19, 2020 2:42:38 PM / by Thierry CHRIN


You know that your customers want to be delighted at every location and channel. Unfortunately, it can be difficult to create a consistent experience because customer data is stored in so many different sources.

The cause may be that you have fragmented or siloed e-commerce and brick-and-mortar data. Databases in each country or geography may store customer data in different formats to match local customs. Often, different brands or lines of business will have their own customer data.

These data silos are nothing to be ashamed of.  They occur naturally when a division automates operations before a global plan is created.  The problem is that well-established and independently designed databases are difficult to link together.

The best practice for integrating data from different customer data sources is to concentrate on resolving customer identities. Once you find a way to ensure that identities match across your data silos, the rest of the problem comes down to linking the sources. You will then be able to assemble a complete picture of each customer for analysis and quality interactions.

The best practices for integrating data silos fall into the following steps:

  1. Data gathering
  2. Data parsing
  3. Postal address hygiene
  4. Email address, phone number, and other data hygiene (if available)
  5. Matching and merging (duplicate reconciliation)
  6. Managing metadata
  7. Creating a system of reference

 

1. Data Gathering

The first step in resolving a person’s identity is to gather all of that person’s data. Taking an inventory of where data is stored often requires help from many parts of the organization to identify potential source files.

Examples of places where data may be found are the following:

  • Marketing: Website forms, email lists, trade show leads, etc.
  • Accounting: Billing, receiving, credit reports, etc.
  • Shipping and fulfillment records
  • Sales: CRM systems, contact lists, etc.
  • Customer Support: Service requests, after sales support.

It is a good idea to note how names, addresses, and phone numbers are stored and the character set used.

 

2. Data Parsing

Once the data is gathered, you will likely find that the structure of each database is different. For global brands in particular, names, addresses, and phone numbers will be stored differently from one country to the next. In some countries, the family name is first, and in others it is last. Addresses may be entered according to local postal regulations.

The best practice is to parse information into its component attributes, which involves breaking up the data into sub units. Examples of attributes used for addresses are as follows: Unit/Apartment/Flat, Premises, Number, Block Sequence, Street, City, District, Town, Postal Code, and Country.

 

Data Parsing Example

Here's an example of data parsing on a Western postal address:

Original Data:

23 DAVID PLACE
ST HELIER JE2 4TE

Parsed Data:

Number: 23
Street: David Place
City/Town: St Helier
Postal Code: JE2 4TE
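
The parsing above can be sketched in a few lines of Python. This is only an illustration: the function name parse_address and both regular expressions are assumptions made for this example, and they only handle a two-line address laid out like the one shown here. Real parsers need country-specific rules and reference data.

```python
import re

# Hypothetical patterns for a two-line address like the example above:
# "<number> <street>" on line 1, "<town> <UK-style postcode>" on line 2.
LINE1 = re.compile(r"^(?P<number>\d+[A-Za-z]?)\s+(?P<street>.+)$")
LINE2 = re.compile(
    r"^(?P<town>.+?)\s+(?P<postcode>[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2})$"
)


def parse_address(raw: str) -> dict:
    """Split a two-line address into the attributes used in the example."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    if len(lines) != 2:
        raise ValueError("expected exactly two address lines")
    m1 = LINE1.match(lines[0])
    m2 = LINE2.match(lines[1].upper())
    if not (m1 and m2):
        raise ValueError("address does not fit the assumed layout")
    return {
        "Number": m1["number"],
        "Street": m1["street"].title(),
        "City/Town": m2["town"].title(),
        "Postal Code": m2["postcode"],
    }


parsed = parse_address("23 DAVID PLACE\nST HELIER JE2 4TE")
```

Raising an error for anything that does not fit the assumed layout is deliberate: a record that cannot be parsed confidently is better routed to review than silently split into the wrong attributes.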

 

Data Parsing Example with a Non-Roman Character Set

An individual may write an address in Japan as follows:

北海道札幌市東区北二十四条東3-3-1

The source information in a Japanese addressing system would be parsed as follows:

Block Sequence: 3-3-1
Area Name: 北二十四条東
District: 東区
City/Town: 札幌市
Prefecture: 北海道
Postal code: 065-0024

The same address would be Romanized for a Western copy of the database:

Block Sequence: 3-3-1
Area Name: Kita-24 Johigashi
District: Higashi-Ku
City/Town: Sapporo-Shi
Prefecture: Hokkaido
Postal Code: 065-0024
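
As a rough sketch, the native-script split can be expressed as a single regular expression that reads the address from the largest unit to the smallest. The pattern and the function name parse_japanese_address are assumptions made for this example. The sketch only covers addresses shaped like the one above (prefecture, city, ward, area, block sequence), it will misparse city names that themselves contain 市, and it does not produce the postal code, which is not part of the written address and has to come from reference data.

```python
import re

# Hypothetical pattern: prefecture, then city (市), ward (区), area name,
# and a trailing block sequence such as 3-3-1.
JP_ADDRESS = re.compile(
    r"^(?P<prefecture>東京都|北海道|(?:京都|大阪)府|.{2,3}?県)"
    r"(?P<city>.+?市)"
    r"(?P<district>.+?区)"
    r"(?P<area>.+?)"
    r"(?P<block>\d+(?:-\d+)*)$"
)


def parse_japanese_address(raw: str):
    """Return the address components, or None if the layout is not recognized."""
    m = JP_ADDRESS.match(raw.strip())
    if m is None:
        return None
    return {
        "Prefecture": m["prefecture"],
        "City/Town": m["city"],
        "District": m["district"],
        "Area Name": m["area"],
        "Block Sequence": m["block"],
    }


parsed_jp = parse_japanese_address("北海道札幌市東区北二十四条東3-3-1")
```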

 

3. Address Hygiene

Address hygiene involves cleansing, standardizing, and validating address information. Hygiene may also involve internationalization to handle multiple languages and character sets. The standardization process involves the following steps:

Components, such as street names and attributes, are corrected and put in official, standardized formats.  For example, in an English-speaking country the term “Lane” might be changed to the standardized abbreviation “LN”. Similarly, the US state Connecticut is sometimes abbreviated as “Conn.”, but it would be standardized to “CT” for all Connecticut records. Potentially fake and vulgar names are flagged, and common problem phrases, such as “Do Not Mail” and “Unknown” in multiple languages, are identified in the name or address fields.
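
A minimal sketch of these standardization rules might look like the following. The lookup tables contain only the examples mentioned above and are purely illustrative, and the function names are invented for this sketch. Production tools rely on far larger, officially maintained tables in many languages.

```python
import re

# Illustrative lookup tables built from the examples in the text.
STREET_SUFFIXES = {"LANE": "LN"}
US_STATES = {"CONN.": "CT", "CONN": "CT", "CONNECTICUT": "CT"}
PROBLEM_PHRASES = ("DO NOT MAIL", "UNKNOWN")


def standardize_street(street: str) -> str:
    """Upper-case the street and abbreviate a known trailing suffix."""
    words = street.upper().split()
    if words and words[-1] in STREET_SUFFIXES:
        words[-1] = STREET_SUFFIXES[words[-1]]
    return " ".join(words)


def standardize_state(state: str) -> str:
    """Map known state spellings to the standard two-letter code."""
    key = state.strip().upper()
    return US_STATES.get(key, key)


def find_problem_phrases(value: str) -> list[str]:
    """Return the problem phrases found in a name or address field."""
    text = re.sub(r"\s+", " ", value.upper())
    return [phrase for phrase in PROBLEM_PHRASES if phrase in text]
```

Note that the problem phrases are only flagged, not removed, so a reviewer or a downstream rule can decide what to do with the record.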

Validation involves comparing a parsed address with official records and making corrections.  Best practice would be to perform the following operations:

  • Determine or validate the country
  • Match to Postcode Address File (PAF) reference data for the specific country, which includes officially licensed sources such as the country’s postal authority, other government agencies, and third parties.
  • Correct standardized identified components
  • Append/insert missing components

While licensed and up-to-date PAF data may cost more than other sources, it can vastly improve accuracy and quality. It is a good idea to ask vendors whether they use officially licensed sources.

It's considered best practice to assign a validity code or score to every address. The code indicates the accuracy of the address, specifies whether or not the address was corrected, and classifies the address as deliverable, undeliverable, or uncorrectable. This lets you determine precisely which addresses mail can be successfully delivered to.
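
One way to picture such a validity code is as a small record attached to each address. The field names and the 0 to 100 score scale below are assumptions made for this sketch, not the output format of any particular tool.

```python
from dataclasses import dataclass
from enum import Enum


class Deliverability(Enum):
    DELIVERABLE = "deliverable"
    UNDELIVERABLE = "undeliverable"
    UNCORRECTABLE = "uncorrectable"


@dataclass(frozen=True)
class ValidityResult:
    score: int        # hypothetical 0-100 accuracy score
    corrected: bool   # True if hygiene changed the address
    status: Deliverability


def mailable(results: dict) -> list:
    """Return the addresses classified as deliverable."""
    return [
        address
        for address, result in results.items()
        if result.status is Deliverability.DELIVERABLE
    ]
```

Keeping the score, the correction flag, and the classification as separate fields makes it possible to filter on any one of them later, for example to review only addresses that were corrected.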

Internationalization makes it possible to compare records entered in different character sets.  The best address hygiene tools on the market, such as Accu360's global address validation system, cover more than 20 languages and nine character sets, including Traditional and Simplified Chinese, Cyrillic, Greek, Japanese, Korean, and Western European (Roman/Latin). The best hygiene systems combine local language and Western-character capabilities in a single platform, allowing for customer data matching and merging across different source systems and data silos. The best global hygiene tools can also localize address spellings, formats, and diacritics or accent characters, giving your global customer data greater consistency and accuracy.
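
A small part of this work, making differently typed versions of the same text comparable, can be sketched with Python's standard unicodedata module. The function name comparison_key is an assumption for this example. The sketch folds full-width characters into their standard forms and drops accents from Latin letters only, so that marks which carry meaning in other scripts (for example Japanese voicing marks) are left alone. It does not transliterate between scripts, which requires dedicated tools.

```python
import unicodedata


def comparison_key(text: str) -> str:
    """Build a key for comparing strings typed with different conventions."""
    # NFKC folds compatibility forms (e.g. full-width digits) to standard ones.
    folded = unicodedata.normalize("NFKC", text)
    # NFD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", folded)
    kept = []
    base_is_latin = False
    for ch in decomposed:
        if unicodedata.combining(ch):
            if base_is_latin:
                continue  # drop accents attached to Latin letters only
        else:
            base_is_latin = unicodedata.name(ch, "").startswith("LATIN")
        kept.append(ch)
    # Recompose so that non-Latin characters return to their usual form.
    recomposed = unicodedata.normalize("NFC", "".join(kept))
    # Case-fold and collapse whitespace.
    return " ".join(recomposed.casefold().split())
```

With this key, "Zürich" and "ZURICH" compare as equal, and a block sequence typed with full-width digits compares as equal to "3-3-1". The key is meant for matching only. The original spelling should still be stored and used for output.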

 

4. Email Address and Phone Number Hygiene

Email addresses and phone numbers may add valuable information when used with names to resolve an individual’s identity. As with postal addresses, a critical step is hygiene.

Phone numbers may be corrected using the following steps:

  • Parse to identify components.
  • Remove illegal characters.
  • Identify or insert country code and local area code.
  • Validate against a reference database.
  • Format phone number.
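
Most of these steps can be sketched as follows, using only the standard library. The function name clean_phone, the default_country_code parameter, and the length bounds are assumptions for this example. The sketch assumes that a leading 0 is a national trunk prefix and that 00 is the international prefix, which is true in many countries but not all. It also skips the validation step, because that requires a licensed reference database.

```python
import re


def clean_phone(raw: str, default_country_code: str) -> str:
    """Return the number as '+' followed by country code and digits."""
    raw = raw.strip()
    has_country_code = raw.startswith("+")
    # Remove everything except ASCII digits (spaces, dashes, brackets, ...).
    digits = re.sub(r"[^0-9]", "", raw)
    if not has_country_code and digits.startswith("00"):
        # Assumed international prefix, e.g. 0044 for +44.
        digits = digits[2:]
        has_country_code = True
    if not has_country_code:
        # Assumed national format: drop the trunk 0, add the country code.
        digits = default_country_code + digits.lstrip("0")
    # Loose sanity bounds; the upper limit of 15 digits follows E.164.
    if not 8 <= len(digits) <= 15:
        raise ValueError(f"unexpected number of digits in {raw!r}")
    return "+" + digits
```

For example, clean_phone("01534 123456", "44") and clean_phone("0044 1534 123456", "44") both return "+441534123456", which gives the matching step a single form to compare.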

The following steps may be used to cleanse a customer’s email address:

  • Check the email format for compliance with internet standards (RFC 5322, which superseded RFC 2822).
  • Parse address into user and domain.
  • Correct common domain errors.
  • Search Domain Name System (DNS) to confirm that the domain exists.
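
The email steps can be sketched in the same spirit. The names clean_email, DOMAIN_FIXES, and resolves_in_dns are invented for this example, the typo table holds only a few sample entries, and the regular expression is a deliberately simplified format check rather than a full implementation of the standard. The DNS helper only checks that the domain name resolves to an address. It does not check mail (MX) records or whether the mailbox exists.

```python
import re
import socket

# Illustrative typo table; a real one would be much longer.
DOMAIN_FIXES = {
    "gmial.com": "gmail.com",
    "gmail.con": "gmail.com",
    "hotmial.com": "hotmail.com",
}

# Simplified format check: one "@", a plain local part, a dotted domain.
SIMPLE_EMAIL = re.compile(
    r"^[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$"
)


def resolves_in_dns(domain: str) -> bool:
    """Return True if the domain name resolves to at least one address."""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except socket.gaierror:
        return False


def clean_email(raw: str, domain_exists=None):
    """Return a cleaned address, or None if it fails a check."""
    candidate = raw.strip()
    if not SIMPLE_EMAIL.match(candidate):
        return None
    # Parse into user and domain, then correct common domain errors.
    user, domain = candidate.rsplit("@", 1)
    domain = domain.lower()
    domain = DOMAIN_FIXES.get(domain, domain)
    # The DNS check is optional so the function can run offline.
    if domain_exists is not None and not domain_exists(domain):
        return None
    return f"{user}@{domain}"
```

Passing domain_exists=resolves_in_dns turns on the DNS step. Leaving it out keeps the function fast and usable offline, for example in a first pass over a large file.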

 

5. Matching and Merging (Duplicate Reconciliation)

Standardized records rarely match exactly. Personal data is always changing. People move. Individuals may provide a work phone number during one interaction and a mobile number during another interaction. Many people also have multiple addresses (e.g., primary and vacation homes), multiple phone numbers (e.g., fixed landline and mobile), and multiple email addresses (e.g., personal and work).

One might think that the goal would be to de-duplicate records by removing duplicate transactions or to change the original records. However, that is not best practice. It is important to keep the original records intact. It might be necessary to reconstruct the events that took place at a certain time. It is also critical to keep a detailed record of when privacy permissions were granted.

The objective of matching systems is to build a Golden Master Record that references the data of matched records. An important technique known as “cascading” helps to confidently build a complete Golden Master Record even when some attributes disagree and to fill in incomplete information.
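
The article does not spell out how cascading works, so the following is only one possible reading: try a sequence of match rules from strictest to loosest, group the records that match, and fill each attribute of the Golden Master Record from the first matched record that has a value. The rule list, field names, and function names are all assumptions for this sketch. The source records are never modified. The golden record only references them by id.

```python
from dataclasses import dataclass, field


@dataclass
class GoldenRecord:
    attributes: dict = field(default_factory=dict)
    source_ids: list = field(default_factory=list)  # links to the originals


# Hypothetical match rules, ordered from strictest to loosest.
MATCH_RULES = [
    ("name", "email"),
    ("name", "phone"),
    ("name", "postal_code"),
]


def records_match(a: dict, b: dict) -> bool:
    """Cascade through the rules; the first rule that fully agrees wins."""
    for rule in MATCH_RULES:
        if all(a.get(key) and a.get(key) == b.get(key) for key in rule):
            return True
    return False


def build_golden_records(records: list) -> list:
    """Group matching records and build one golden record per group."""
    groups = []
    for record in records:
        for group in groups:
            if any(records_match(record, member) for member in group):
                group.append(record)
                break
        else:
            groups.append([record])

    goldens = []
    for group in groups:
        golden = GoldenRecord(source_ids=[r["id"] for r in group])
        for record in group:
            for key, value in record.items():
                # Fill gaps only; earlier records take precedence.
                if key != "id" and value and key not in golden.attributes:
                    golden.attributes[key] = value
        goldens.append(golden)
    return goldens
```

This single pass is intentionally simple. It does not go back and merge two groups when a later record links them, which a production matching system would need to handle.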

 

6. Metadata

It is important that the Golden Master Record contain additional information (“metadata”) for each attribute or group of attributes. This metadata shows which information is the most current, the quality of the source it came from, and more.

The source databases must be considered when building the metadata for the Golden Master Record as well. A large conglomerate may want to have a Golden Master Record that covers the entire conglomerate. In other cases, subsidiaries may be considered as separate entities.
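
As an illustration, attribute-level metadata can be modeled by storing each candidate value together with its source, its last update date, and a source quality rating. The field names, the 1 to 5 quality scale, and the rule "most recent first, then best source" are assumptions for this sketch. Your own business rules may rank candidates differently.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AttributeValue:
    value: str
    source: str          # which source database supplied the value
    updated_on: date     # when the source last changed it
    source_quality: int  # hypothetical rating, 1 (low) to 5 (high)


def pick_current(candidates: list) -> AttributeValue:
    """Prefer the most recent value; break ties by source quality."""
    return max(candidates, key=lambda c: (c.updated_on, c.source_quality))


shipping_address = pick_current([
    AttributeValue("23 David Place", "crm", date(2019, 3, 1), 4),
    AttributeValue("5 Harbour Road", "e-commerce", date(2020, 9, 12), 3),
])
```

Here the e-commerce value wins because it is newer, even though the CRM is rated as the better source. The losing candidate is kept in its source system rather than discarded.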

 

7. System of Reference

Once you have cleansed, matched, and merged your customer data, the best practice is to use a system of reference which will synchronize with your existing sources of data. Conceptually, it is good to think of a hub and spoke, with the system of reference at the center. When each spoke system has new data or needs new data, it can request it from the central hub. Only the data that needs updating is synchronized.

You can enhance source information when records are matched using a combination of algorithms and business rules.  For example, if one business unit has an updated shipping address for a customer who recently purchased, all of the business units can have access to the new and more accurate data.  (Of course, you need to make sure that all privacy rights are respected.)

Not all of the data is copied to the system of reference. Typically, the system of reference will store only the most relevant data, such as account numbers and contact data, and also links to the source records. Because of these links, all information can be retrieved for analysis and building better customer experiences.
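
The hub-and-spoke idea can be sketched as a small in-memory class. The class name ReferenceHub, its methods, and the version counter are assumptions for this example, and a real system of reference would sit on a database and enforce privacy permissions. The sketch shows the two properties described above: the hub stores only key contact data plus links to the source records, and a spoke only receives data when something has changed.

```python
class ReferenceHub:
    """Toy system of reference: contact data plus links to source records."""

    def __init__(self):
        self._contact = {}  # customer_id -> dict of contact attributes
        self._links = {}    # customer_id -> set of (system, record_id)
        self._version = {}  # customer_id -> number of updates so far

    def publish(self, customer_id, system, record_id, contact):
        """A spoke reports its data; return only what actually changed."""
        self._links.setdefault(customer_id, set()).add((system, record_id))
        current = self._contact.setdefault(customer_id, {})
        changed = {k: v for k, v in contact.items() if current.get(k) != v}
        if changed:
            current.update(changed)
            self._version[customer_id] = self._version.get(customer_id, 0) + 1
        return changed

    def pull(self, customer_id, known_version):
        """A spoke asks for updates; nothing is sent if it is up to date."""
        version = self._version.get(customer_id, 0)
        if version == known_version:
            return version, {}
        return version, dict(self._contact.get(customer_id, {}))

    def links(self, customer_id):
        """Return the source records to consult for the full detail."""
        return sorted(self._links.get(customer_id, set()))
```

A spoke system would call publish when it has new data and pull when it needs current data, passing the version number it last saw so that unchanged customers cost nothing to synchronize.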

Creating a system of reference is much better than creating a system of record that is replicated across your organization. Replicated systems often fall out of synchronization. The risk is that different systems will return different results for the same query, and that current data will be overwritten with old data.

Another advantage with a system of reference is that it has a minimal impact on your existing operations, which speeds deployment and minimizes risk.

 

Conclusion

Following best practices will enable you to create consistent customer experiences across your organization.

Creating Golden Master Records with links to original sources is critical to implementing successful and useful Customer 360 and Single Customer View applications. Deep international experience with an understanding of data in each country increases the chance of success.

Topics: Customer Data Integration, Data Quality, Customer 360


Written by Thierry CHRIN

Thierry has led major transformation programs, both Back Office and Digital related, for global companies like Louis Vuitton, Carrefour, Estée Lauder and Clarins. Over the last 10 years, Thierry has built international expertise in building consumer-facing solutions.