Apache Spark
DRAFT: This mapping definition is a work in progress and subject to change.
CDS | Spark / Delta Lake | Datasphere | Comment |
---|---|---|---|
cds.Boolean | BOOLEAN | cds.Boolean | |
cds.String(length) | STRING | cds.String | Datasphere Logic: IF cds.String(length = undefined) THEN cds.String(length = 5000) |
cds.LargeString | STRING | cds.LargeString | TODO: Check this. No length limit? |
cds.Integer | INT | cds.Integer | |
cds.Integer64 | BIGINT | cds.Integer64 | |
cds.Decimal(precision, scale) | DECIMAL(p,s) | cds.Decimal | Datasphere Logic: IF cds.Decimal(p < 17) THEN cds.Decimal(p = 17) (see the precision sketch below the table) |
cds.Decimal(precision = 34, scale = floating) | not supported | cds.DecimalFloat | Decimal with scale = floating is not supported in Spark |
Amounts with currencies: cds.Decimal(precision = 34, scale = 4) | DECIMAL(34, 4) | cds.Decimal(34, 4) | Since Spark does not support cds.DecimalFloat, cds.Decimal(34, 4) is used as a compromise for now (see the schema sketch below the table) |
cds.Double(precision, scale) | DOUBLE | cds.Double | Datasphere Logic: IF cds.Double(precision, scale) THEN cds.Double() (precision and scale are dropped) |
cds.Date | DATE | cds.Date | |
cds.Time, expressed for now as cds.String(6) or cds.String(12) (depending on the source representation) plus the annotation @Semantics.time: true | STRING | cds.String(6) or cds.String(12) | Data is in the format HHmmss or HH:mm:ss.SSS; consumers must use the function to_time() to convert it to cds.Time (a Spark-side sketch follows the table) |
cds.DateTime (second precision) | TIMESTAMP | cds.Timestamp | |
cds.Timestamp (microsecond precision) | TIMESTAMP | cds.Timestamp | |
cds.UUID, plus the annotation @Semantics.uuid: true | STRING(36) | cds.UUID | |
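
To make the mapping concrete, here is a minimal sketch of a Delta Lake table that uses the Spark types from the table above. It assumes a SparkSession configured with Delta Lake support; the table name `sales_order` and all column names are hypothetical, chosen only to illustrate one column per mapping row.

```python
# Minimal sketch, assuming a SparkSession configured with Delta Lake support.
# The table name and column names are hypothetical illustrations of the mapping.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cds-type-mapping-sketch").getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_order (
        is_active   BOOLEAN,         -- cds.Boolean
        description STRING,          -- cds.String / cds.LargeString
        item_count  INT,             -- cds.Integer
        row_id      BIGINT,          -- cds.Integer64
        net_amount  DECIMAL(34, 4),  -- amounts with currencies
        order_date  DATE,            -- cds.Date
        order_time  STRING,          -- cds.Time, carried as HHmmss or HH:mm:ss.SSS
        changed_at  TIMESTAMP,       -- cds.DateTime / cds.Timestamp
        uuid        STRING           -- cds.UUID, 36 characters
    ) USING DELTA
""")
```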
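The Datasphere precision rule for cds.Decimal in the table above boils down to: precisions below 17 are raised to 17, everything else passes through. A hypothetical helper illustrating the rule (this is not an actual Datasphere API):

```python
# Sketch of the Datasphere precision rule for cds.Decimal.
# Hypothetical helper for illustration only, not a Datasphere API.
def datasphere_decimal_precision(precision: int) -> int:
    """Raise precisions below 17 to 17; leave larger precisions unchanged."""
    return max(precision, 17)

assert datasphere_decimal_precision(10) == 17
assert datasphere_decimal_precision(34) == 34
```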
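The to_time() conversion mentioned in the cds.Time row happens on the consuming side. On the Spark side, an equivalent conversion of the HHmmss string representation could look like the following sketch; the column name `order_time` is again hypothetical, and for the HH:mm:ss.SSS representation the pattern would be "HH:mm:ss.SSS" instead.

```python
# Sketch of a Spark-side conversion for the cds.Time string convention.
# Parses an HHmmss string (e.g. "134502") into a timestamp on 1970-01-01,
# from which the time portion can be formatted back out.
from pyspark.sql import functions as F

df = spark.table("sales_order")
df = (
    df.withColumn("order_ts", F.to_timestamp(F.col("order_time"), "HHmmss"))
      .withColumn("order_time_hms", F.date_format(F.col("order_ts"), "HH:mm:ss"))
)
```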