SAP Hybris Marketing File-Based Load via SAP Cloud Platform Integration (HCI) – Key Observations

SAP Hybris Marketing offers many ways to integrate data from an external system. For instance, you can connect another SAP system like CRM, ERP or Cloud for Customer. Additionally, you can import data via an API or OData.

Another business scenario is to import data initially or regularly via a file using SAP Cloud Platform Integration services (HCI). This integration scenario allows you to create or update interactions, contacts, corporate accounts, products, and product categories in your SAP Hybris Marketing system. It works for both cloud and on-premise systems, although cloud systems are likely to be the most common case.

The new integration package named “SAP Hybris Marketing Cloud – file based data load” is available on the SAP API Business Hub. It runs on the SAP Cloud Platform (HCP) as an integration service tenant and connects to the SAP Hybris Marketing system via an OData service. The CSV files are loaded from an SFTP server to SAP Cloud Platform Integration.

The content is delivered as an iFlow. An iFlow is a graphical tool to configure integration scenarios. With an iFlow, you can immediately see the complete end-to-end integration without the need to drill down:

* The sender, the receivers, and the number of receivers.
* The interfaces used with the sender and receivers.
* The adapters used with the sender and receivers.
* The dynamic routing used.
* The mappings used.

The following figure illustrates the integration flow for the object Interaction:

What happens in the iFlow for interactions?

The CSV files are stored on an SFTP server. HCI fetches the CSV files and processes them as follows:

* A script removes the header from the file.
* The CSV file is converted to XML format. With the help of the splitter, you can split the file into smaller packages.
* The mapping from XML to OData structure is performed.
* The split packages are sent via OData to SAP Hybris Marketing.

If an error occurs, an e-mail can be sent, so you do not need to monitor the iFlow permanently.
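The steps above can be illustrated outside of HCI with a minimal Python sketch. It is not the iFlow's implementation: the function name, the comma delimiter, and the XML element names `Interactions` and `Interaction` are assumptions chosen for illustration only.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml_packages(csv_text, package_size, delimiter=","):
    """Drop the header row, convert the rows to XML, and split them into packages."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader)  # header row: used for element names, not sent as data
    rows = [row for row in reader if row]  # skip completely empty lines

    packages = []
    for start in range(0, len(rows), package_size):
        root = ET.Element("Interactions")  # hypothetical element name
        for row in rows[start:start + package_size]:
            record = ET.SubElement(root, "Interaction")  # hypothetical element name
            for column, value in zip(header, row):
                ET.SubElement(record, column).text = value
        packages.append(ET.tostring(root, encoding="unicode"))
    return packages
```

Each returned string stands for one package that would then be mapped to the OData structure and sent to SAP Hybris Marketing.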

Key Observations

We tested only the iFlows for interactions and contacts.

Columns of the CSV Input File

The integration package “SAP Hybris Marketing Cloud – file based data load” includes CSV sample files. These files show which fields are imported.

In a first test, we used the sample file for contacts and copied it to our SFTP server. The import worked properly.

In a second test, we created our own CSV file for contacts. Because our sample contacts had no company assigned, we removed the columns SAPERPConsumerAccountId, CompanyId and CustomerName. This time the iFlow raised an error stating that the XSD schema is incompatible with the CSV payload.

The reason for the error: the CSV-to-XML conversion expects all columns of the mapping to be present in the file.

Recommendation:

Open the sample CSV file for the corresponding object to understand which columns are required in the import file. The mapping is also explained in the Best Practice document attached to the iFlow.
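One way to catch this kind of problem before uploading is to compare the header of your own file with the header of the delivered sample file. The following Python sketch assumes a comma delimiter and uses hypothetical function names; the real column list must be taken from the sample file of the package.

```python
import csv
import io


def read_header(csv_text, delimiter=","):
    """Return the column names from the first line of a CSV text."""
    return next(csv.reader(io.StringIO(csv_text), delimiter=delimiter))


def compare_headers(sample_csv, own_csv, delimiter=","):
    """Return (missing, unexpected) columns of own_csv relative to sample_csv."""
    sample = read_header(sample_csv, delimiter)
    own = read_header(own_csv, delimiter)
    missing = [column for column in sample if column not in own]
    unexpected = [column for column in own if column not in sample]
    return missing, unexpected
```

In our second test, such a check would have reported SAPERPConsumerAccountId, CompanyId and CustomerName as missing. Columns without values should stay in the file as empty columns instead of being removed.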

Validation of the Data within the iFlow

In further tests, we imported files with a higher data volume: 100000 data sets in one file, and almost 1 million data sets.
In customer projects, data from many different external systems is imported into the SAP Hybris Marketing system. In our test scenario, we did not have such an external system, so we generated our own sample data for the import with our own tool.

With the generated sample data, some issues occurred regarding the XSD validation within HCI. For example, a value of a field was too long, see figure below.

Another issue was the wrong format of the field Timestamp. In such a case, you must open the CSV file on the SFTP server and correct the data inconsistencies. The best way is to check the consistency of your data in the CSV import file before you copy the file to the SFTP server.

Recommendation:

The rules that the XML document must satisfy to be considered “valid” are defined in the XSD file. If you are unsure about the permitted length or format of a field value, you can find the rules in the iFlow under Resources.
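A simple pre-check of the CSV file before it is copied to the SFTP server could look like the following Python sketch. It is only an illustration: the timestamp format YYYYMMDDhhmmss is inferred from the sample values in this article, and the maximum field lengths passed in `max_lengths` are placeholders that you must replace with the values from the XSD file.

```python
import csv
import io
from datetime import datetime


def find_invalid_rows(csv_text, max_lengths, timestamp_column="Timestamp", delimiter=","):
    """Return (line_number, column, problem) tuples for values that look invalid."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        # Length check against limits supplied by the caller (taken from the XSD).
        for column, limit in max_lengths.items():
            value = row.get(column) or ""
            if len(value) > limit:
                problems.append((line_number, column, "too long"))
        # Timestamp check: exactly 14 digits that form a real date and time.
        timestamp = row.get(timestamp_column) or ""
        if timestamp:
            valid = len(timestamp) == 14 and timestamp.isdigit()
            if valid:
                try:
                    datetime.strptime(timestamp, "%Y%m%d%H%M%S")
                except ValueError:
                    valid = False
            if not valid:
                problems.append((line_number, timestamp_column, "bad timestamp"))
    return problems
```

The reported line numbers refer to lines in the CSV file, which makes it easier to locate and correct the affected data sets.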

Structure of CSV Input File for Interactions

As already mentioned, the content of the CSV import file must comply with some conditions to pass the included validation. For the interaction CSV file, you must also ensure the validity of the structure within the CSV.

In our test case, we have imported the following CSV file for interactions:

Columns: ContactId | ContactIdOrigin | CommunicationMedium | InteractionType | Timestamp | CampaignId | InitiativeId | InitiativeVersion | Valuation | Reason | IsAnonymous | Amount | Currency | Latitude | Longitude | SourceObjectType | SourceObjectId | SourceObjectAdditionalId | SourceDataUrl | ContentTitle | ContentData | FirstName | LastName | CustomerName | EMailAddress | PhoneNumber | MobilePhoneNumber | IsContact | IsConsumer

Data rows (one row per line; "(empty)" stands for one or more empty columns):

TorgardTollefsen@rhyta.com | EMAIL | ONLINE_SHOP | SHOP_ITEM_VIEW | 20170916220001 | (empty) | 59369337 | 10702933 | (empty) | Torgard | Tollefsen | Alert Alarm Company | TorgardTollefsen@rhyta.com | (empty) | TRUE
TorgardTollefsen@rhyta.com | EMAIL | ONLINE_SHOP | SHOP_ITEM_VIEW | 20170916220002 | (empty) | 59369337 | 10702933 | (empty) | Torgard | Tollefsen | Alert Alarm Company | TorgardTollefsen@rhyta.com | (empty) | TRUE
TorgardTollefsen@rhyta.com | EMAIL | ONLINE_SHOP | PROD_REVIEW_VIEW | 20170916220003 | (empty) | 59369337 | 10702933 | (empty) | Torgard | Tollefsen | Alert Alarm Company | TorgardTollefsen@rhyta.com | (empty) | TRUE
TorgardTollefsen@rhyta.com | EMAIL | ONLINE_SHOP | PROD_REVIEW_VIEW | 20170916220033 | (empty) | 59369337 | 10702933 | (empty) | Torgard | Tollefsen | Alert Alarm Company | TorgardTollefsen@rhyta.com | (empty) | TRUE
TiagoSilvaCorreia@gustr.com | EMAIL | ONLINE_SHOP | SHOP_ITEM_VIEW | 20170916220000 | (empty) | -19800431 | -4464859 | (empty) | Tiago | Correia | Custom Lawn Care | TiagoSilvaCorreia@gustr.com | (empty) | TRUE
TiagoSilvaCorreia@gustr.com | EMAIL | ONLINE_SHOP | PROD_REVIEW_VIEW | 20170916220001 | (empty) | -19800431 | -4464859 | (empty) | Tiago | Correia | Custom Lawn Care | TiagoSilvaCorreia@gustr.com | (empty) | TRUE
TiagoSilvaCorreia@gustr.com | EMAIL | ONLINE_SHOP | SHOP_ITEM_ADD | 20170916220031 | (empty) | -19800431 | -4464859 | (empty) | Tiago | Correia | Custom Lawn Care | TiagoSilvaCorreia@gustr.com | (empty) | TRUE

As a result, the Import Monitor app displays the following error:

The reason for this error:

In HCI, each row of the CSV file is mapped to both a contact and an interaction. As a result, one contact and one interaction are sent to the SAP Hybris Marketing system for every row in the CSV file.

After the mapping within HCI, the following contact data sets are therefore sent to SAP Hybris Marketing:

Columns: ContactId | ContactIdOrigin | FirstName | LastName | CustomerName | EMailAddress | PhoneNumber | MobilePhoneNumber | IsContact | IsConsumer

Data rows (one row per line; "(empty)" stands for one or more empty columns):

TorgardTollefsen@rhyta.com | EMAIL | Torgard | Tollefsen | Alert Alarm Company | TorgardTollefsen@rhyta.com | (empty) | TRUE
TorgardTollefsen@rhyta.com | EMAIL | Torgard | Tollefsen | Alert Alarm Company | TorgardTollefsen@rhyta.com | (empty) | TRUE
TorgardTollefsen@rhyta.com | EMAIL | Torgard | Tollefsen | Alert Alarm Company | TorgardTollefsen@rhyta.com | (empty) | TRUE
TorgardTollefsen@rhyta.com | EMAIL | Torgard | Tollefsen | Alert Alarm Company | TorgardTollefsen@rhyta.com | (empty) | TRUE
TiagoSilvaCorreia@gustr.com | EMAIL | Tiago | Correia | Custom Lawn Care | TiagoSilvaCorreia@gustr.com | (empty) | TRUE
TiagoSilvaCorreia@gustr.com | EMAIL | Tiago | Correia | Custom Lawn Care | TiagoSilvaCorreia@gustr.com | (empty) | TRUE
TiagoSilvaCorreia@gustr.com | EMAIL | Tiago | Correia | Custom Lawn Care | TiagoSilvaCorreia@gustr.com | (empty) | TRUE

The problem is that the staging area of SAP Hybris Marketing does not allow multiple updates of the same contact within the same package.

Based on this finding, we recommend the following structure for the CSV import file, which prevents multiple updates of the same contact. The contact attributes are filled only in the first row of each contact:

Data rows (same columns as in the file above; one row per line; "(empty)" stands for one or more empty columns):

TorgardTollefsen@rhyta.com | EMAIL | ONLINE_SHOP | SHOP_ITEM_VIEW | 20170916220001 | (empty) | 59369337 | 10702933 | (empty) | Torgard | Tollefsen | Alert Alarm Company | TorgardTollefsen@rhyta.com | (empty) | TRUE
TorgardTollefsen@rhyta.com | EMAIL | ONLINE_SHOP | SHOP_ITEM_VIEW | 20170916220002 | (empty) | 59369337 | 10702933 | (empty)
TorgardTollefsen@rhyta.com | EMAIL | ONLINE_SHOP | PROD_REVIEW_VIEW | 20170916220003 | (empty) | 59369337 | 10702933 | (empty)
TorgardTollefsen@rhyta.com | EMAIL | ONLINE_SHOP | PROD_REVIEW_VIEW | 20170916220033 | (empty) | 59369337 | 10702933 | (empty)
TiagoSilvaCorreia@gustr.com | EMAIL | ONLINE_SHOP | SHOP_ITEM_VIEW | 20170916220000 | (empty) | -19800431 | -4464859 | (empty) | Tiago | Correia | Custom Lawn Care | TiagoSilvaCorreia@gustr.com | (empty) | TRUE
TiagoSilvaCorreia@gustr.com | EMAIL | ONLINE_SHOP | PROD_REVIEW_VIEW | 20170916220001 | (empty) | -19800431 | -4464859 | (empty)
TiagoSilvaCorreia@gustr.com | EMAIL | ONLINE_SHOP | SHOP_ITEM_ADD | 20170916220031 | (empty) | -19800431 | -4464859 | (empty)
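If your source data repeats the contact attributes in every row, the recommended structure can be produced with a small preprocessing step. The following Python sketch is one possible approach, not part of the delivered package. It assumes the rows are available as dictionaries (for example from `csv.DictReader`) and clears the contact attribute columns for every repeated contact in the whole file, which is stricter than the per-package restriction described above.

```python
# Contact attribute columns as listed in the interaction CSV header.
CONTACT_COLUMNS = [
    "FirstName", "LastName", "CustomerName", "EMailAddress",
    "PhoneNumber", "MobilePhoneNumber", "IsContact", "IsConsumer",
]


def blank_repeated_contact_data(rows):
    """Keep contact attributes only in the first row of each contact."""
    seen = set()
    result = []
    for row in rows:
        key = (row["ContactId"], row["ContactIdOrigin"])
        new_row = dict(row)  # do not modify the caller's data
        if key in seen:
            for column in CONTACT_COLUMNS:
                if column in new_row:
                    new_row[column] = ""
        else:
            seen.add(key)
        result.append(new_row)
    return result
```

The interaction columns, including ContactId and ContactIdOrigin, are left untouched, so every interaction can still be assigned to its contact.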

If you are sure that the contacts for the interactions to be imported already exist, you can also use the following structure:

Columns: ContactIdOrigin | CommunicationMedium | InteractionType | Timestamp | CampaignId | InitiativeId | InitiativeVersion | Valuation | Reason | IsAnonymous | Amount | Currency | Latitude | Longitude | SourceObjectType | SourceObjectId | SourceObjectAdditionalId | SourceDataUrl | ContentTitle | ContentData | FirstName | LastName | CustomerName | EMailAddress | PhoneNumber | MobilePhoneNumber | IsContact | IsConsumer

Data rows (one row per line; "(empty)" stands for one or more empty columns):

EMAIL | ONLINE_SHOP | SHOP_ITEM_VIEW | 20170916220001 | (empty) | 59369337 | 10702933 | (empty)
EMAIL | ONLINE_SHOP | SHOP_ITEM_VIEW | 20170916220002 | (empty) | 59369337 | 10702933 | (empty)
EMAIL | ONLINE_SHOP | PROD_REVIEW_VIEW | 20170916220003 | (empty) | 59369337 | 10702933 | (empty)
EMAIL | ONLINE_SHOP | PROD_REVIEW_VIEW | 20170916220033 | (empty) | 59369337 | 10702933 | (empty)
EMAIL | ONLINE_SHOP | SHOP_ITEM_VIEW | 20170916220000 | (empty) | -19800431 | -4464859 | (empty)
EMAIL | ONLINE_SHOP | PROD_REVIEW_VIEW | 20170916220001 | (empty) | -19800431 | -4464859 | (empty)
EMAIL | ONLINE_SHOP | SHOP_ITEM_ADD | 20170916220031 | (empty) | -19800431 | -4464859 | (empty)

Recommendation:

Ensure that the CSV import file for interactions has the correct structure.

Error Handling / Processing of Big Files

As mentioned above, the iFlow includes a validation to ensure the consistency of the data in the file. However, it has no real error handling.

This means that after a validation error in HCI, the complete CSV file is processed again and again until the issue is corrected in the source CSV file.

The straightforward flow is as follows:

* You want to import a CSV file into SAP Hybris Marketing, so you copy the file to the SFTP server.
* After approximately five minutes, the file is fetched by HCI and processed there.
* If no errors occur within HCI, the data is imported into SAP Hybris Marketing.
* Then the CSV file is copied to the archive folder at HCI.

In case of a validation error at HCI, the CSV file is not copied to the archive folder. After some time, the same CSV file is fetched by HCI and processed again. This means that all the data is sent to SAP Hybris Marketing again.

Imagine you have a CSV file consisting of 1 million data sets and only one data set has an error. In this case, the complete end-to-end flow is repeated until the error is corrected within the CSV file.

In the CSV file, every data set has a timestamp. With the help of this timestamp, SAP Hybris Marketing can recognize that the received data is obsolete and does not import it again.

However, when you copy several big CSV files with errors to the SFTP server, the repeated processing can cause a high workload at SAP Hybris Marketing, possibly resulting in poor system performance for end users.

Recommendation:

To make error handling easier, avoid importing very large CSV files. It is also easier to find and correct an error in a rather “small” CSV file. Regarding overall performance, we could not identify a significant difference between “small” and “big” CSV files.
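If your source system delivers one large export, you can split it before copying it to the SFTP server. The following Python sketch shows one way to do this; the function name, the comma delimiter, and the chunk size are assumptions. Every resulting file repeats the header row, so each file remains a complete import file.

```python
import csv
import io


def split_csv(csv_text, rows_per_file, delimiter=","):
    """Split a CSV text into smaller CSV texts that each repeat the header row."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader)
    rows = [row for row in reader if row]  # skip completely empty lines

    chunks = []
    for start in range(0, len(rows), rows_per_file):
        buffer = io.StringIO()
        writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows[start:start + rows_per_file])
        chunks.append(buffer.getvalue())
    return chunks
```

Each returned text can then be written to its own file and uploaded separately, so that a single faulty data set only causes one small file to be reprocessed.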

Log Level Configuration

When you deploy an iFlow, the log level is by default set to “Info”.

Recommendation:

Change the log level to Debug under Monitor -> Monitor Integration Content.

Monitoring at SAP Hybris Marketing

A successful processing of a CSV file at HCI does not automatically include a successful data import at SAP Hybris Marketing.

Successful processing at HCI only means that the data could be passed to the staging area (new with release 1708) of SAP Hybris Marketing. The staging area is a kind of inbound queue, where the imported data is stored and then further processed by the application.

In the staging area, some system-dependent checks are performed. For instance, when you import a contact with an unknown facet type, the contact is blocked in the staging area. This means that you can find the contact with an error in the Import Monitor app, but not in the Contacts application.

When the validation checks of the staging area are passed, the data is handed over to the application, where the import takes place. If an error occurs during the import of an object, for example because the phone number of a contact has an incorrect format, you can find the error message in the Application Log app.

Only if both the staging area checks and the data import are successful can you find the object in the corresponding application at SAP Hybris Marketing.

Recommendation:

Even if a CSV file was processed successfully at HCI, also monitor the Import Monitor and Application Log apps at SAP Hybris Marketing.

Performance Measurements

We carried out some performance measurements. For this purpose, we prepared CSV files with 10000 and 100000 data sets for contacts and interactions.

iFlow Contacts:

For a CSV file with 100000 data sets, it took 10 minutes on average to process all data sets within HCI (without the transfer time from SFTP to HCI – as the transfer time strongly depends on network bandwidth).

We observed a short processing time for the step “CSV to XML Convert” – approx. 14 seconds. Most of the time is consumed for the OData requests (sending a data package to SAP Hybris Marketing and waiting for response before next package can be sent from HCI to SAP Hybris Marketing).

For CSV files with 10000 data sets, we observed roughly the same processing time per data set, that is, the total processing time scales approximately linearly with the number of data sets.

Summary:

For a contact we observed an average processing time of 6 milliseconds per data set within HCI.

iFlow Interaction:

The performance of the Interaction iFlow is not as good as that of the Contact iFlow, because the Interaction iFlow triggers two OData calls: one for interactions and one for contacts. For example, when you have a CSV file with 100000 data sets, 100000 interactions and 100000 contacts are in fact transferred to SAP Hybris Marketing.

Summary: For an interaction we observed an average processing time of 13 milliseconds per data set within HCI.

Parallelization

Within an HCI Tenant

It is possible to parallelize the process within an HCI tenant. In the “Splitter” step, the content is split into packages of 1000 entries each, and the packages are then sent to SAP Hybris Marketing via OData calls. You can parallelize this process by activating the Parallel Processing checkbox in the “Splitter” step.
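The difference between sequential and parallel sending of packages can be sketched in Python as follows. This only illustrates the concept and does not reflect how HCI implements the Parallel Processing option. The `send` argument is a hypothetical callable standing in for the OData request, and the default of four workers is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def send_packages(packages, send, parallel=False, max_workers=4):
    """Send all packages with `send`, either one after another or in parallel."""
    if not parallel:
        # Sequential: each request waits for the previous response.
        return [send(package) for package in packages]
    # Parallel: several requests are in flight at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns the results in the order of the input packages.
        return list(pool.map(send, packages))
```

Because most of the processing time is spent waiting for OData responses, overlapping the requests is what reduces the total run time.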

Without parallelization, we observed that the Contact iFlow takes an average processing time of 6 milliseconds per data set within HCI.

With parallelization, the processing time decreased by a factor of 4, to an average of 1.5 milliseconds per data set.
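The figures above are related by simple arithmetic, which you can also use for a rough estimate of your own loads. The following Python sketch uses hypothetical helper names and assumes that the processing time scales linearly with the number of data sets, as we observed for 10000 and 100000 data sets.

```python
def per_record_ms(total_seconds, records):
    """Average processing time per data set in milliseconds."""
    return total_seconds * 1000 / records


def estimated_minutes(records, ms_per_record):
    """Estimated processing time in minutes, assuming linear scaling."""
    return records * ms_per_record / 1000 / 60


# Contact iFlow: 100000 data sets in 10 minutes.
contact_ms = per_record_ms(10 * 60, 100_000)   # 6.0 ms per data set
parallel_ms = contact_ms / 4                   # 1.5 ms with parallel processing

# Interaction iFlow: 13 ms per data set, 100000 data sets (about 21.7 minutes).
interaction_minutes = estimated_minutes(100_000, 13)
```

Such estimates cover only the processing within HCI and exclude the transfer time from the SFTP server.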

With Several Nodes

If you want to execute a simultaneous load with several nodes, you have the option to request additional resources via a support ticket on component LOD-HCI-PI-OPS.

We did not test with several nodes.

Recommendation:

In case of simultaneous load, reduce Max. Messages per Poll from 20 to 1 or 2. With this change, you can ensure that the files on the SFTP server are fetched by several nodes.

November 14, 2017