Creating SDTM datasets: who is responsible?


The clinical trial data flow is well defined, with standardized roles and procedures. However, nuances in the process flow can affect submission timing.


Responsibility for developing the statistical analysis strategy and the analysis outputs clearly lies with statisticians and statistical programmers, while data managers are in charge of data collection and cleaning. Who is in charge of creating SDTM datasets, however, is less clear.


Is it data management? Is it statistical programming? Should it be done manually, or should a dedicated data standards group be formed?

When does Data Management generate SDTM datasets?

When data management is done well, the resulting data is already close to SDTM-compliant. With SDTM owned by Data Management, data managers can devote more time to maintaining their data and ensuring its quality, while statistical programmers are freed up to develop analysis datasets and TFLs (tables, figures, and listings) using their statistical expertise. A specialist team that focuses solely on SDTM datasets, has the time to keep up with changes to the standards, and thoroughly understands SDTM and submission packages would be a distinct advantage.


It would also reduce the statistical programming effort that is otherwise spent compensating for a less-than-perfect eCRF design and a lack of standards implementation in Data Management. Generating SDTM is data preparation rather than analysis, so it is more closely related to data management. Many SDTM programmers do not program ADaM datasets or TFLs, for example. There is also a gap in data knowledge between data managers and statistical programmers.

Why are statistical programmers generating SDTM?

Typically, SDTM falls under statistical programming because the required transformations have to be programmed, and these teams are usually staffed by experienced SAS programmers who can produce the datasets. Again, this depends largely on the individual company, but the key reasons are programming agility and the overall capacity to program (currently in SAS) against the standards and requirements for producing SDTM.
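To make "transformations" concrete, here is a minimal sketch in Python (used for illustration only; as noted above, production work is usually done in SAS). It maps one raw demographics record to a few SDTM DM variables. The raw field names, the study identifier, the date format, and the `STUDYID-SITEID-SUBJID` convention for `USUBJID` are all assumptions made for this example, not rules taken from any particular sponsor.

```python
from datetime import datetime

# Assumed mapping from raw eCRF text to submission values for SEX.
SEX_MAP = {"Male": "M", "Female": "F"}


def to_iso8601(raw_date: str) -> str:
    """Convert a DD/MM/YYYY date (assumed raw format) to ISO 8601."""
    return datetime.strptime(raw_date, "%d/%m/%Y").strftime("%Y-%m-%d")


def raw_to_dm(raw: dict, study_id: str) -> dict:
    """Build a partial DM record from one hypothetical raw record."""
    return {
        "STUDYID": study_id,
        "DOMAIN": "DM",
        # Sponsor-defined; this concatenation is only an assumed convention.
        "USUBJID": f"{study_id}-{raw['site']}-{raw['subject']}",
        "SUBJID": raw["subject"],
        "SITEID": raw["site"],
        "BRTHDTC": to_iso8601(raw["birth_date"]),
        # Fall back to "U" (unknown) when the raw value is not mapped.
        "SEX": SEX_MAP.get(raw["gender"], "U"),
    }


# Hypothetical raw record as it might come out of an EDC export.
raw_record = {
    "site": "101",
    "subject": "0007",
    "birth_date": "23/04/1975",
    "gender": "Female",
}
dm_record = raw_to_dm(raw_record, "ABC123")
print(dm_record)
```

Even in this toy case, the work combines programming (date conversion, key construction) with data knowledge (which raw values map to which submission values), which is why both groups have a claim to it.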


Where the define.xml is created probably matters too, since the same programmers may be needed for that work at a different stage of the process. On the other hand, because of their grasp of the data and the standards, data managers tend to pick up on many details that the programming team misses.


How does the company’s size and scale affect SDTM responsibility?


At small-to-medium-sized companies, work such as creating the define.xml sits under statistical programming. At the scale of a large pharmaceutical enterprise, however, Data Management is brought in to provide standards and procedures that span dozens of trials running at hundreds of sites around the world.


At the global pharma level, the regulatory burden for clinical trial data security, privacy, and code validation is enormous. Data Management is designed to handle that burden, allowing statisticians to focus on statistics.

The future of automation

Perhaps a better question than "Who should do this?" is "How should we create SDTM datasets?" Automation is one answer, but that solution would require a complete change in how things have traditionally been done, as well as a considerable shift in thinking.


Within data management, SDTM is expected to be driven by a well-defined Metadata Repository (MDR) for standardization. Some of the large pharmaceutical companies have tried this, with varying degrees of success.
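As a rough illustration of what "driven by metadata" could mean, the sketch below keeps the mapping rules as data and applies them with one generic function, so changing a mapping means editing metadata rather than code. The rule format, the `apply_mapping` function, the raw field names, and the study identifier are all hypothetical; a real MDR would be far richer than a Python list.

```python
# Hypothetical mapping rules, standing in for what an MDR might hold.
VS_MAPPING = [
    {"target": "STUDYID", "constant": "ABC123"},
    {"target": "DOMAIN", "constant": "VS"},
    {
        "target": "VSTESTCD",
        "source": "test",
        # Assumed raw labels mapped to vital signs test codes.
        "codelist": {"Systolic BP": "SYSBP", "Diastolic BP": "DIABP"},
    },
    {"target": "VSORRES", "source": "result"},
    {"target": "VSORRESU", "source": "unit"},
]


def apply_mapping(raw: dict, mapping: list) -> dict:
    """Apply each metadata rule in order to build one output record."""
    out = {}
    for rule in mapping:
        if "constant" in rule:
            value = rule["constant"]
        else:
            value = raw[rule["source"]]
            if "codelist" in rule:
                # Raises KeyError on an unmapped value, which surfaces
                # gaps in the metadata instead of hiding them.
                value = rule["codelist"][value]
        out[rule["target"]] = value
    return out


# Hypothetical raw vital signs record.
raw_vs = {"test": "Systolic BP", "result": "120", "unit": "mmHg"}
vs_record = apply_mapping(raw_vs, VS_MAPPING)
print(vs_record)
```

With this shape, the question of ownership shifts from who writes the transformation code to who maintains the metadata.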

