Apache Spark

DRAFT: This mapping definition is work in progress and subject to further change.

  • Spark data types are taken from: Link
  • Datasphere data types are taken from: Link
| CDS | Spark / Delta Lake | Datasphere | Comment |
| --- | --- | --- | --- |
| cds.Boolean | BOOLEAN | cds.Boolean | |
| cds.String (length) | STRING | cds.String | Datasphere logic: IF cds.String(length = undefined) THEN cds.String(length = 5000) |
| cds.LargeString | STRING | cds.LargeString | TODO: Check this. No length limit? |
| cds.Integer | INT | cds.Integer | |
| cds.Integer64 | BIGINT | cds.Integer64 | |
| cds.Decimal (precision, scale) | DECIMAL(p, s) | cds.Decimal | Datasphere logic: IF cds.Decimal(p < 17) THEN cds.Decimal(p = 17) |
| cds.Decimal (precision = 34, scale = floating) | not supported | cds.DecimalFloat | Decimal with scale = floating is not supported in Spark |
| Amounts with currencies: cds.Decimal (precision = 34, scale = 4) | DECIMAL(34, 4) | cds.Decimal(34, 4) | Since Spark does not support cds.DecimalFloat, we use cds.Decimal(34, 4) as a compromise for now |
| cds.Double (precision, scale) | DECIMAL(p, s) | cds.Double | Datasphere logic: IF cds.Double(precision, scale) THEN cds.Double() (precision and scale are dropped) |
| cds.Date | DATE | cds.Date | |
| cds.Time + the annotation @Semantics.time: true | STRING | cds.String(6) or cds.String(12) | For now, cds.Time must be expressed as cds.String(6) or cds.String(12), depending on the source representation. Data is in format HHmmss or HH:mm:ss.SSS; the consumer must use the function to_time() to convert it back to cds.Time |
| cds.DateTime (second precision) | TIMESTAMP | cds.Timestamp | |
| cds.Timestamp (µs precision) | TIMESTAMP | cds.Timestamp | |
| cds.UUID + the annotation @Semantics.uuid: true | STRING(36) | cds.UUID | |
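The scalar mappings and the Datasphere normalization rules above can be sketched as plain Python. This is an illustrative sketch only, not an official API; the dictionary and helper names are made up for this example.

```python
# Scalar CDS types with a fixed Spark / Delta Lake counterpart
# (illustrative table, names chosen for this sketch only).
CDS_TO_SPARK = {
    "cds.Boolean": "BOOLEAN",
    "cds.String": "STRING",
    "cds.LargeString": "STRING",
    "cds.Integer": "INT",
    "cds.Integer64": "BIGINT",
    "cds.Date": "DATE",
    "cds.DateTime": "TIMESTAMP",   # second precision
    "cds.Timestamp": "TIMESTAMP",  # microsecond precision
    "cds.UUID": "STRING",          # STRING(36) + @Semantics.uuid: true
}

def datasphere_string_length(length=None):
    """Datasphere logic: an undefined length defaults to 5000."""
    return 5000 if length is None else length

def datasphere_decimal_precision(precision):
    """Datasphere logic: precisions below 17 are raised to 17."""
    return max(precision, 17)
```

So `cds.String` without a length becomes `cds.String(5000)`, and `cds.Decimal(10, 2)` becomes `cds.Decimal(17, 2)` on the Datasphere side.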
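For the cds.Time workaround, a consumer-side conversion equivalent to the to_time() function mentioned above could look like the following. This is a hypothetical helper for illustration, not part of any shipped library; it only assumes the two string encodings the table names (HHmmss and HH:mm:ss.SSS).

```python
from datetime import time

def parse_cds_time(value: str) -> time:
    """Convert a string-encoded time back to a time value.

    Hypothetical stand-in for the consumer's to_time() conversion:
    cds.String(6) carries HHmmss, cds.String(12) carries HH:mm:ss.SSS.
    """
    if len(value) == 6:   # cds.String(6): HHmmss
        return time(int(value[0:2]), int(value[2:4]), int(value[4:6]))
    if len(value) == 12:  # cds.String(12): HH:mm:ss.SSS
        hh, mm, rest = value.split(":")
        ss, ms = rest.split(".")
        return time(int(hh), int(mm), int(ss), int(ms) * 1000)
    raise ValueError(f"unexpected time encoding: {value!r}")
```

For example, `parse_cds_time("123456")` and `parse_cds_time("12:34:56.789")` both yield a time of 12:34:56, the latter with millisecond precision preserved.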