Structured Data Transformation Language (SDTL) is an independent intermediate language for representing data transformation commands. Statistical analysis packages (e.g., SPSS, Stata, SAS, and R) provide similar functionality, but each one has its own proprietary language. SDTL consists of JSON schemas for common operations, such as RECODE, MERGE FILES, and VARIABLE LABELS. SDTL provides machine-actionable descriptions of variable-level data transformation histories derived from any data transformation language. Provenance metadata represented in SDTL can be added to documentation in DDI and other metadata standards.

Supported Activities:

  • Generate key machine-actionable metadata on production processes for inclusion in DDI and other metadata standards
  • Capture transformation processes for provenance purposes
  • Capture a metadata life cycle that parallels the data life cycle
  • Capture processing information in a structure that can be used to create syntax for a range of statistical packages
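As a rough illustration of the last activity, the sketch below stores one recode operation as a JSON-serializable record and renders it as syntax for two packages. The record layout and every field name (`command`, `source_variable`, `target_variable`, `rules`) are hypothetical simplifications made up for this example and do not follow the published SDTL schemas; the helper functions `to_spss` and `to_stata` are likewise assumptions, not part of any SDTL tooling.

```python
import json

# Hypothetical, simplified command record. The field names are illustrative
# assumptions and do NOT reproduce the published SDTL JSON schemas.
recode_command = {
    "command": "Recode",
    "source_variable": "age",
    "target_variable": "agegrp",
    "rules": [
        {"from": 0, "to": 17, "value": 1},
        {"from": 18, "to": 64, "value": 2},
    ],
}


def to_spss(cmd):
    """Render the record as an SPSS RECODE ... INTO statement."""
    rules = " ".join(
        f"({r['from']} thru {r['to']}={r['value']})" for r in cmd["rules"]
    )
    return f"RECODE {cmd['source_variable']} {rules} INTO {cmd['target_variable']}."


def to_stata(cmd):
    """Render the same record as a Stata recode command."""
    rules = " ".join(
        f"({r['from']}/{r['to']}={r['value']})" for r in cmd["rules"]
    )
    return f"recode {cmd['source_variable']} {rules}, generate({cmd['target_variable']})"


if __name__ == "__main__":
    # The record is plain JSON, so it can be stored alongside other metadata.
    print(json.dumps(recode_command, indent=2))
    print(to_spss(recode_command))
    print(to_stata(recode_command))
```

The point of the sketch is the direction of travel: one package-neutral record is the source, and package-specific syntax is derived from it rather than the other way around.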

License

SDTL – Structured Data Transformation Language is free software that you can distribute and/or modify under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Development Work

SDTL was created by the Continuous Capture of Metadata Project supported by the Data Infrastructure Building Blocks (DIBBs) program of the National Science Foundation through grant NSF ACI-1640575.

Future Work

SDTL is maintained and managed by the SDTL Working Group.

Selected Articles

Version 1.0 [current version]

Publication date: 2020-12-01