CDM tables mapping files and descriptors
The following section details the mapping sequence that the cdm-mapper tool follows to map meteorological data to a CDM element.
We will use part of the header.json Python dictionary from the icoads_r3000 IMMA1 model to explain how an element is mapped. The table below explains all the element attributes and/or descriptors that each Python dictionary or .json file needs for a successful mapping of the input meteorological data.
Below is part of the content of a header.json file:
{
    "report_id": {
        "sections": "c98",
        "elements": "UID",
        "transform": "string_add",
        "kwargs": {"prepend": "ICOADS-30", "separator": "-"}
    },
    "application_area": {
        "default": [1,7,10,11]
    },
    "observing_programme": {
        "sections": "c1",
        "elements": "PT",
        "transform": "observing_programme"
    },
    "report_type": {
        "default": 0
    },
    "platform_type": {
        "sections": "c1",
        "elements": "PT",
        "code_table": "platform_type"
    },
    "platform_sub_type": {
        "sections": "c1",
        "elements": "PT",
        "code_table": "platform_sub_type"
    },
    "location_accuracy": {
        "sections": "core",
        "elements": ["LI","LAT"],
        "transform": "location_accuracy",
        "decimal_places": 0
    },
    "station_speed": {
        "sections": "core",
        "elements": ["YR","VS"],
        "code_table": "ship_speed_ms",
        "decimal_places": 1
    },
    "source_id": {
        "sections": ["c1","c1","core","core"],
        "elements": ["SID","DCK","YR","MO"],
        "transform": "string_join_add",
        "kwargs": {"prepend": "ICOADS-3-0-0T", "separator": "-", "zfill_col": [0,3], "zfill": [3,2]}
    },
    "source_record_id": {
        "sections": "c98",
        "elements": "UID"
    }
}
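To make the report_id entry more concrete, the sketch below shows what a transform in the style of string_add could do with the kwargs listed above. The function body is an assumption written for illustration, not the tool's implementation, and "ABC123" is a made-up UID value; only the descriptor names and kwargs values come from header.json.

```python
# Illustrative stand-in for a "string_add"-style transform.
# The real function is defined in the imodel's imodel.py and may differ.

def string_add_sketch(value, prepend="", append="", separator=""):
    """Join prepend, value and append with the separator, skipping empty parts."""
    parts = [prepend, str(value), append]
    return separator.join(p for p in parts if p)


# The report_id entry from header.json, as a Python dictionary.
report_id_mapping = {
    "sections": "c98",
    "elements": "UID",
    "transform": "string_add",
    "kwargs": {"prepend": "ICOADS-30", "separator": "-"},
}

# "ABC123" is a hypothetical UID used only for this example.
report_id = string_add_sketch("ABC123", **report_id_mapping["kwargs"])
print(report_id)  # ICOADS-30-ABC123
```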
Descriptors
Descriptor variable name | Function |
elements | String or list of strings with the element name(s) to map to the CDM table. E.g. report_id information is stored in the imma1 schema as UID, so UID is the name assigned to the elements attribute of report_id. |
sections | String or list of strings with the section name(s) that the element(s) to be mapped come from. Use a single string if all the elements are located in the same section, e.g. location_accuracy: the variables ["LI","LAT"] come from the single section core in the IMMA1 model. Use a list of strings to declare variables that come from multiple sections and elements, e.g. source_id. Always respect the order of the sections in the original schema. |
default | Assigns a default value to the CDM element. |
fill_value | Value to assign for missing data (NA/NaN). Datetime objects are not supported. |
transform | Name of the function used to perform the mapping of a specific element. This function must be defined in the mapping_functions class of the imodel.py module in order to be accessed by the mapper tool. |
kwargs | Keyword arguments of a transform function, if any. Type: dictionary with the format {keyword: value, …}. |
code_table | Name of the code table in the imodel mapping library needed to perform the mapping of a particular element. Type: string. |
decimal_places | Number of decimal places to keep when printing an element. Type: an integer value, or the name of a function used to estimate this figure. Such a function must be defined in the same way as a transform function, but it cannot take keyword arguments. Set decimal_places = 0 for integer elements defined as numeric in the CDM; otherwise the element is printed with the default number of decimal places. |
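The relation between the sections and elements descriptors can be sketched as follows. This is one reading of the descriptions above (a single section string applies to every element, a list of sections is matched to the elements by position); the helper name pair_sections_elements is hypothetical and not part of the tool.

```python
def pair_sections_elements(mapping):
    """Return (section, element) pairs for one mapping entry.

    A single section string is applied to every element; a list of
    sections is matched to the elements by position.
    """
    elements = mapping.get("elements")
    if elements is None:
        return []  # e.g. entries that only define a default
    if isinstance(elements, str):
        elements = [elements]
    sections = mapping.get("sections")
    if sections is None or isinstance(sections, str):
        sections = [sections] * len(elements)
    if len(sections) != len(elements):
        raise ValueError("sections and elements must have the same length")
    return list(zip(sections, elements))


# Two entries taken from the header.json example.
location_accuracy = {"sections": "core", "elements": ["LI", "LAT"]}
source_id = {
    "sections": ["c1", "c1", "core", "core"],
    "elements": ["SID", "DCK", "YR", "MO"],
}

print(pair_sections_elements(location_accuracy))
# [('core', 'LI'), ('core', 'LAT')]
print(pair_sections_elements(source_id))
# [('c1', 'SID'), ('c1', 'DCK'), ('core', 'YR'), ('core', 'MO')]
```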
Mapping sequence
The mapper parses the mapping file element by element and takes the following steps:
- Clean imodel data
  Remove any missing elements from the imodel. This preliminary step makes the definition of mapping functions easier, as no NaN handling needs to be added to the functions, and integer fields cast to float by the presence of NA/NaN are reverted.
- Map CDM element in the following order:
  - If transform: eval function and apply with elements and/or kwargs as appropriate
  - Else if code_table: map imodel elements using the defined code_table
  - Else if elements: assign imodel elements to CDM element
  - Else if value: assign value to CDM element
- Fill CDM element NA/NaN values using default if defined
- Define the number of decimal places in the CDM element attributes if decimal_places is provided, so that it gets passed to the table writer
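The order of these steps can be sketched for a single record as shown below. This is a simplified illustration rather than the mapper's code: it works on one plain dictionary instead of whole tables, ignores sections, handles only single-key code tables, and assumes that the "value" step refers to the default descriptor. The names map_element, transforms and code_tables, and the sample record, are hypothetical.

```python
def map_element(record, mapping, transforms, code_tables):
    """Map one CDM element for one record, following the documented order."""
    elements = mapping.get("elements")
    if isinstance(elements, str):
        elements = [elements]

    # Step 1: clean the input so mapping functions never see missing values.
    values = [record.get(e) for e in elements] if elements else []
    complete = all(v is not None for v in values)

    # Step 2: map, trying the descriptors in the documented order.
    result = None
    if "transform" in mapping:
        if complete:
            func = transforms[mapping["transform"]]
            result = func(*values, **mapping.get("kwargs", {}))
    elif "code_table" in mapping:
        if complete and values:
            table = code_tables[mapping["code_table"]]
            result = table.get(str(values[0]))  # single-key tables only
    elif elements:
        if complete:
            result = values[0] if len(values) == 1 else values
    elif "default" in mapping:
        result = mapping["default"]  # assumed meaning of the "value" step

    # Step 3: fill a missing result with the default, if one is defined.
    if result is None:
        result = mapping.get("default")

    # Step 4: keep decimal_places as an attribute for the table writer.
    attributes = {}
    if "decimal_places" in mapping:
        attributes["decimal_places"] = mapping["decimal_places"]
    return result, attributes


def string_add(value, prepend="", separator=""):
    """Hypothetical stand-in for the imodel's string_add transform."""
    return f"{prepend}{separator}{value}"


transforms = {"string_add": string_add}
code_tables = {"platform_sub_type": {"7": 69}}
header = {
    "report_id": {"elements": "UID", "transform": "string_add",
                  "kwargs": {"prepend": "ICOADS-30", "separator": "-"}},
    "report_type": {"default": 0},
    "platform_sub_type": {"elements": "PT", "code_table": "platform_sub_type"},
    "source_record_id": {"elements": "UID"},
}
record = {"UID": "ABC123", "PT": 7}  # made-up input values

mapped = {name: map_element(record, entry, transforms, code_tables)[0]
          for name, entry in header.items()}
print(mapped)
```

Checking transform first means that an entry with both elements and transform is always handled by the function, and the plain assignment of elements only applies when no transform or code table is given.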
Defining mapping functions
In the file imodel.py the user can define any function to transform any element in the data model. The Python file needs to be accompanied by an __init__.py file so that all the functions written in imodel.py can be imported by the cdm-mapper toolbox.
Note
Remember that any new Python dependency that you import at the top of your imodel.py must also be installed in your Python environment.
The cdm-mapper follows a set of rules that need to be taken into account when adding functions to the imodel.py script.
- The cdm-mapper only passes elements to the transforming function (e.g. Year, day or hour) or code_table mapping (e.g. platform_subtype) where none of the elements to be mapped (e.g. Year, day, hour or platform_subtype) have missing values.
- The output of all functions in imodel.py must respect the element type defined in the imodel mapper.
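A minimal sketch of what an imodel.py could look like is given below. Only the class name mapping_functions comes from the text above; the constructor argument atts, the function knots_to_ms and its factor keyword are hypothetical, and plain lists are used here to keep the sketch free of dependencies, whereas the actual tool may pass other data structures.

```python
# imodel.py (sketch). An __init__.py file must sit next to this file.


class mapping_functions:
    """Container for the transform functions of this imodel."""

    def __init__(self, atts=None):
        # "atts" is a hypothetical hook for element attributes.
        self.atts = atts

    def knots_to_ms(self, speeds, factor=0.514444):
        """Convert speeds from knots to m/s (hypothetical transform).

        No missing-value handling is needed here: the mapper only passes
        elements without missing values to transform functions.
        """
        return [round(s * factor, 1) for s in speeds]


# A transform name from a mapping file can be resolved to the method by name.
func = getattr(mapping_functions(), "knots_to_ms")
speeds_ms = func([10, 12.5])
print(speeds_ms)  # [5.1, 6.4]
```

Because the function returns floats, the corresponding CDM element would have to be defined as numeric for the output to respect the element type.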
Code tables
Elements defined in the imodel .json files (e.g. elements inside header.json) with the attribute code_table have a specific “key” that links the element variable to its corresponding numerical code defined in the C3S CDM. Code tables contain the key:value pairs and are stored as individual .json files in the lib/mappings/imodel/code_tables subdirectory.
The content of a code table translating platform_sub_type information into the appropriate CDM syntax (platform_sub_type.json) is shown below:
{
"7": 69
}
This code table is part of the icoads_r3000 data model included in this tool.
The following code table structures are currently supported:
- Simple code tables: code tables with a list of key:value pairs.
- Nested code tables: code tables with multiple (2 or more) keys mapping to a value -> key(1):…:key(n):value.
- Range-keyed code tables: code tables (simple or multi-keyed) where one or more keys is an (integer) range of values.
For more information on code tables and their structure, check out the mdf_reader tool - code tables information.
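As an illustration of the nested structure key(1):…:key(n):value, the sketch below walks a table with one key per level. The table contents and the helper name lookup are invented for this example, and range-keyed tables are not covered because their key syntax is described in the mdf_reader documentation rather than here.

```python
def lookup(code_table, keys):
    """Walk a simple or nested code table using one key per level."""
    node = code_table
    for key in keys:
        if not isinstance(node, dict):
            return None  # more keys were given than the table has levels
        node = node.get(str(key))  # keys loaded from JSON are strings
        if node is None:
            return None  # key not present at this level
    # A dictionary here means too few keys were given to reach a value.
    return None if isinstance(node, dict) else node


# Hypothetical two-level table: first key, then second key, then value.
nested_table = {
    "1": {"0": 10, "1": 20},
    "2": {"0": 30},
}

print(lookup(nested_table, [1, 1]))  # 20
print(lookup(nested_table, [2, 0]))  # 30
print(lookup(nested_table, [3, 0]))  # None
```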
The code table above is used by the icoads_r3000 imodel to map platform_sub_type information to the C3S CDM format. This is done in the following section of the header.json file:
"platform_sub_type": {
"sections": "c1",
"elements": "PT",
"code_table": "platform_sub_type"
}
The “key” in this case is the value read from the ICOADS section c1 and element PT: for key values equal to 7, the code 69 is assigned.
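The lookup described above can be sketched with the standard json module. The function name map_platform_sub_type is hypothetical, and returning None for values that are absent from the table is a choice made for this sketch, not necessarily what the tool does.

```python
import json

# Content of platform_sub_type.json as shown above.
code_table = json.loads('{"7": 69}')


def map_platform_sub_type(pt):
    """Return the CDM code for an ICOADS PT value, or None if unmapped."""
    # JSON object keys are always strings, so cast the value before lookup.
    return code_table.get(str(pt))


print(map_platform_sub_type(7))  # 69
print(map_platform_sub_type(5))  # None
```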
Code tables can also be used for simple transformations of the elements, depending on the metadata to map, e.g. the case of deck 701, where we expand ship names to the ship's original full name. We do this by reading metadata from the c99 ICOADS supplemental data attachment. The imodel for deck 701 provides a code table to transform the names into the format recorded in the original ship logbook (see the ship_names.json code table):
"station_name": {
"sections": "core",
"elements": "ID",
"code_table": "ship_names"
}