Product and Groups Ingestion
Product Ingestion
Introduction
Eagle Eye AIR allows for a list of Retailer’s Products and Product Groups to be imported
automatically. This can be a one-off import during the on-boarding process or a regular import
to ensure the information within the AIR platform is up-to-date.
Initial Setup
The initial setup of the unit structure needs to be completed in Air before products can be
ingested.
Ingestion Overview
Product ingestion uses two separate files: one that lists all the products sold by the
retailer, and one that describes the hierarchy within which those products exist. Both files
are required for the ingestion to run.
Each run completely replaces the existing product data for the given unit. Delta ingestion
is not supported for products, as it can easily lead to data inconsistency.
In a basic unit structure (see the Store Ingestion guide for details), the product ingestion is
run against the top-level unit. For an advanced structure, the ingestion is run at the banner
level, making it possible to have different product lists for different company units.
File Formats
Both the products file and the groups file share the same file format requirements, including
the encoding, delimiter, encapsulation and forbidden character rules described below.
We require CSV files to have a list of field names as the file header (the very first line
of the file). All other records (lines) should contain the data to be imported.
Data files may include additional columns (with empty or non-empty values); these are
ignored during import. For each supported import mechanism, the mandatory
columns must be present in the file for the ingestion process to start.
File Encoding
All files must use Linux (LF) line endings rather than Windows (CRLF) line endings.
Content Encoding
The CSV files must be encoded with UTF-8 encoding (ASCII is a subset of UTF-8 encoding).
Non-UTF-8 files or rows containing forbidden characters will not be imported.
Files must not contain a UTF-8 BOM.
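The encoding rules above can be checked on the raw bytes of a file before it is sent. The following is a minimal sketch, not part of the platform: the function name `check_file_bytes` and the wording of the messages are hypothetical.

```python
# The three bytes that make up a UTF-8 byte order mark.
UTF8_BOM = b"\xef\xbb\xbf"


def check_file_bytes(data):
    """Return a list of encoding problems found in the raw file content."""
    problems = []
    if data.startswith(UTF8_BOM):
        problems.append("file starts with a UTF-8 BOM")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"not valid UTF-8 at byte {exc.start}")
    if b"\r\n" in data:
        problems.append("file contains Windows (CRLF) line endings")
    return problems
```

A file could be checked with `check_file_bytes(open(path, "rb").read())`; an empty list means none of these three problems were found.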
Field Delimiter
We require a comma (, or U+002C) as the CSV Field Delimiter and a double quote
(" or U+0022) as the CSV Text Delimiter by default. We recommend
you use the CSV Text Delimiter for all field values, not just for string fields.
Data Escaping
For product files, we allow the use of special characters (double quotes, slash,
backslash, apostrophe, comma, etc.) as long as they are specially prepared.
- a quotation mark (" or U+0022) should be represented by a pair of consecutive double quotes or prefixed with a backslash in the CSV content, e.g. "" or \"
- other allowed special characters should be prefixed with a backslash (\ or U+005C).
- additional whitespace outside field values may corrupt the CSV file, in which case we will not be able to read and process it.
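One way to produce rows that follow these rules is to let a CSV library do the quoting. The sketch below uses Python's standard `csv` module and the doubled-quote form of escaping; the helper name `to_csv` is hypothetical, and writing the header row unquoted is an assumption that simply mirrors the examples in this guide.

```python
import csv
import io


def to_csv(header, rows):
    """Build CSV text with every data value wrapped in double quotes."""
    buffer = io.StringIO()
    # Header written as plain comma-separated names, as in this guide's examples.
    buffer.write(",".join(header) + "\n")
    writer = csv.writer(
        buffer,
        quoting=csv.QUOTE_ALL,  # wrap every value in the text delimiter
        doublequote=True,       # a " inside a value is written as ""
        lineterminator="\n",    # Linux line endings
    )
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, `to_csv(["short_name", "upc"], [['12" Pizza', "123"]])` writes the data row as `"12"" Pizza","123"`, with no whitespace outside the field values.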
Forbidden Characters
We will not be able to process rows containing special, non-visual characters in
field values, examples include:
- NUL (\0 or U+0000)
- TAB (\t or U+0009)
- FORM FEED (\f or U+000C)
- LINE FEED / NEWLINE (\n or U+000A)
- CARRIAGE RETURN (\r or U+000D)
Warning – these characters may be added by your CSV editor when compiling a list
of products.
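Field values can be screened for these characters before the file is written. This is a minimal sketch under stated assumptions: the helper `forbidden_in` is hypothetical, it covers only the example characters listed above, and it treats `\f` as form feed (U+000C) and `\n` as line feed (U+000A).

```python
# Example forbidden characters from the list above, keyed by character.
FORBIDDEN = {
    "\x00": "NUL",
    "\t": "TAB",
    "\x0c": "FORM FEED",
    "\n": "LINE FEED",
    "\r": "CARRIAGE RETURN",
}


def forbidden_in(value):
    """Return the sorted names of forbidden characters found in a field value."""
    return sorted({FORBIDDEN[ch] for ch in value if ch in FORBIDDEN})
```

A value such as `"Milk\tChocolate"` would be reported as containing `TAB`, so the row can be corrected before it is rejected at import.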
File Names
The file naming format is configurable in the Air platform. Once configured, only files that
match the configured file naming format will be ingested. Air also checks for files of the
same name having already been processed and does not process the same file more than once.
Although this is configurable, Eagle Eye proposes you use formats similar to the below:
- Product Master: <UnitNameWithNoSpaces>-product_master-<dateCreated>.csv
- Product Groups: <UnitNameWithNoSpaces>-product_groups-<dateCreated>.csv
Where <dateCreated> is in the format YYYYMMDDHHMMSS. Including the creation timestamp keeps each file name unique.
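A file name following the proposed pattern could be generated as shown below. This is only a sketch: the function `ingestion_file_name` and its `kind` parameter are hypothetical, and the pattern configured in your environment may differ.

```python
from datetime import datetime


def ingestion_file_name(unit_name, kind, created):
    """Build '<UnitNameWithNoSpaces>-product_<kind>-<YYYYMMDDHHMMSS>.csv'.

    `kind` is expected to be "master" or "groups".
    """
    unit = "".join(unit_name.split())  # remove all whitespace from the unit name
    return f"{unit}-product_{kind}-{created:%Y%m%d%H%M%S}.csv"
```

Passing `datetime.now()` as `created` gives each extraction a distinct name, provided two files of the same kind are not created within the same second.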
Group/Hierarchy File
The following fields can be specified in the group ingestion file
| Column | Type | Mandatory | Value Uniqueness Required | Description | Maximum String Length |
|---|---|---|---|---|---|
| group_reference | String | Y | Y | Product Group Reference | 100 |
| type | String | Y | N | Product Group Type | 100 |
| name | String | N | N | Product Group Name | 255 |
| description | String | N | N | Product Group Description | 255 |
| parent_reference | String | N | N | Product Group Parent Reference | 100 |
Please note: Where uniqueness is required, the first occurrence of a record will be ingested.
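Because only the first occurrence is ingested, it can help to list repeated values before sending a file. The following is a minimal sketch using Python's standard `csv` module; the helper name `repeated_values` is hypothetical.

```python
import csv
import io


def repeated_values(csv_text, column):
    """Return (line_number, value) for every repeat of a value in `column`."""
    seen = set()
    repeats = []
    rows = csv.DictReader(io.StringIO(csv_text))
    for line_number, row in enumerate(rows, start=2):  # line 1 is the header
        value = row[column]
        if value in seen:
            repeats.append((line_number, value))
        else:
            seen.add(value)
    return repeats
```

Calling `repeated_values(text, "group_reference")` on a groups file returns the rows whose reference has already appeared earlier in the file.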
Example File
The first row of the CSV file is the header and must contain the column names. All other rows
in the file must contain the data to be ingested.
group_reference,type,name,description,parent_reference
"2","EE","Confectionary","Confectionary",
"201","EE","Confectionary Multi Packs","Confectionary Multi Packs","2"
"202","EE","Chocolate Bars","Chocolate Bars","2"
"203","EE","Sharing Bags","Sharing Bags","2"
Product Master File
The following fields can be specified in the product master ingestion file
| Column | Type | Mandatory | Value Uniqueness Required | Description |
|---|---|---|---|---|
| short_name | String | N | N | Product Short Name |
| long_name | String | N | N | Product Full Name |
| upc | String | Y* | Y* | Product UPC |
| sku | String | Y* | Y* | Product SKU |
| group_reference | String | Y | N | Group Reference as defined in the Groups File |
| tag | String | N | N | Product Tags (pipe separated) |
| image | String | N | N | Product Image URL |
| brand | String | N | N | The name of the brand for this product |
* One of UPC or SKU must be provided depending on the configuration of your environment
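The mandatory columns from the two tables can be checked against a file's header row before upload. This is a sketch under stated assumptions: the function `missing_columns` and the constant names are hypothetical, and the check accepts either `upc` or `sku` because which one is required depends on your environment's configuration.

```python
import csv
import io

# Mandatory columns taken from the group and product master tables.
GROUP_MANDATORY = {"group_reference", "type"}
PRODUCT_MANDATORY = {"group_reference"}
PRODUCT_IDENTIFIERS = {"upc", "sku"}  # at least one must be present


def missing_columns(csv_text, mandatory, one_of=frozenset()):
    """Return a list of problems found in the header row (empty if none)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = set(next(reader, []))
    problems = sorted(mandatory - header)
    if one_of and not (one_of & header):
        problems.append("one of: " + ", ".join(sorted(one_of)))
    return problems
```

For a product file, `missing_columns(text, PRODUCT_MANDATORY, PRODUCT_IDENTIFIERS)` returns an empty list when the header contains `group_reference` and at least one of `upc` or `sku`.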
Example File
The first row of the CSV file is the header and must contain the column names. All other
rows must contain the data to be imported. Mandatory column headers must be present in the
file to start the ingestion process. Any additional columns in the file will be ignored
during the import.
short_name,long_name,upc,group_reference,tag,image,brand
"5x Milk Chocolate","5x Milk Chocolate Bars","8726536","201","Meal Deal|Confectionary|Chocolate","https://image3.com/image3.jpg","Brand 1"
"Dark Chocolate Bar","Dark Chocolate Bar","8276345","202","Meal Deal|Confectionary|Chocolate","https://image1.com/image1.jpg","Brand 2"
"Milk Chocolate Sharer","Milk Chocolate Sharing bag","7637265","203","Meal Deal|Confectionary|Chocolate","https://image2.com/image2.jpg",
Common Ingestion Issues
Due to the complexity of the data in these files, the first attempt at extraction and
ingestion frequently does not work exactly as expected. The sections below outline some
common cases where issues arise.
Unescaped Characters
Product names frequently use the double quote character " to denote inches. Because the
encapsulation character is also a double quote, an unescaped quote breaks the file structure.
Escape these characters as described in the Data Escaping section above.
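Lines with a stray double quote can often be found by parsing each line strictly. The sketch below uses the `strict=True` option of Python's standard `csv` reader; the helper name `malformed_lines` is hypothetical. It assumes quotes are escaped by doubling them, so a file that uses backslash-escaped quotes would be flagged as well.

```python
import csv


def malformed_lines(csv_text):
    """Return the 1-based numbers of lines that fail strict CSV parsing."""
    bad = []
    for number, line in enumerate(csv_text.splitlines(), start=1):
        try:
            # strict=True raises csv.Error when a closing quote is not
            # followed by a delimiter or the end of the line.
            next(csv.reader([line], strict=True))
        except csv.Error:
            bad.append(number)
    return bad
```

A row such as `"12" Pizza","123"` is reported, while the escaped form `"12"" Pizza","123"` passes.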
UTF-8 BOM Header
Some applications save a UTF-8 Byte Order Mark (BOM) at the start of a CSV file. A file with
a BOM is unreadable and is not supported. Please make sure the file encoding follows the
specifications above.
Empty Group References or Unknown Groups
All products in the file need to be associated with a group defined in the product hierarchy
file. Files are often received with no group reference, or with a reference that does not
exist in the hierarchy file.
In these cases the product row is skipped and not ingested.
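The rows that would be skipped for this reason can be listed in advance by comparing the two files. This is a minimal sketch using Python's standard `csv` module; the function name `skipped_products` is hypothetical.

```python
import csv
import io


def skipped_products(groups_csv, products_csv):
    """Return (line_number, group_reference) for product rows whose group
    reference is empty or not defined in the groups file."""
    groups = csv.DictReader(io.StringIO(groups_csv))
    known = {row["group_reference"] for row in groups}
    skipped = []
    products = csv.DictReader(io.StringIO(products_csv))
    for line_number, row in enumerate(products, start=2):  # line 1 is the header
        reference = row.get("group_reference") or ""
        if not reference or reference not in known:
            skipped.append((line_number, reference))
    return skipped
```

An empty result means every product row points at a group present in the hierarchy file.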
Missing/Duplicate Files
The ingestion of products requires both a product file and a group file. If either file is
missing, the ingestion cannot be processed.
As noted above, a file with the same name cannot be processed more than once. Including
a unique date/time stamp in the file name usually avoids this.
Ingestion Timing
When working with the on-boarding team, a schedule should be agreed for when the ingestion
process runs. We suggest a daily refresh of product data, run overnight. It's worth leaving
a gap between the scheduled extraction from the retailer systems and the timing of the
ingestion into Air. If the extraction is delayed and the ingestion job runs in Air before
the files arrive, no files will be found.
