Maria
on

 

When I create define.xml, I notice that the length listed is not based on my datasets (e.g., COREF has a length of 200, but in my dataset it is much smaller). Are there any plans to make this data-driven? It seems this could be a problem down the road.

Thanks,

Maria

Forums: Define.xml

Lex
on January 28, 2012

The define.xml should describe your datasets. It seems that you created a variable with length 200, so the define.xml describes this accurately. If you want the define.xml to show a smaller length, you should change the length in the dataset.

See also: CDER Common Data Standards Issues Document (Version 1.1/December 2011)
(http://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/FormsSubmissionRequirements/ElectronicSubmissions/UCM254113.pdf)

"For both CDISC and non-CDISC datasets, in order to significantly reduce dataset file sizes, the allotted character variable length/size for each column in a dataset should be the maximum length used. Lengths/Sizes of columns should not arbitrarily be set to 200. For example, if your USUBJID column has a maximum length of 18 being used throughout the dataset, the USUBJID’s column size should be set to 18, not to 200. Alternative solutions to this problem that involve some inclusion of a small amount of padding to column width may be acceptable as long as they don’t result in significant increases in file size due to the padding."


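To make the "maximum length used" rule quoted above concrete, here is a minimal Python sketch that scans the character columns of a dataset and reports the longest value found in each. The function name `max_used_lengths`, the sample rows, and the choice to measure UTF-8 bytes are all assumptions for illustration; in practice you would read the rows from your own datasets and use whatever encoding they are stored in.

```python
def max_used_lengths(rows, char_columns):
    """Return the longest value length (in bytes) seen in each character column."""
    lengths = {col: 0 for col in char_columns}
    for row in rows:
        for col in char_columns:
            value = row.get(col) or ""
            # Trailing blanks are padding, not data, so they are ignored.
            # Assumption: lengths are counted in bytes of the UTF-8 encoding.
            used = len(value.rstrip().encode("utf-8"))
            lengths[col] = max(lengths[col], used)
    return lengths


# Hypothetical sample data, not taken from any real study.
rows = [
    {"USUBJID": "STUDY01-001-0001", "RACE": "WHITE"},
    {"USUBJID": "STUDY01-001-0002", "RACE": "BLACK OR AFRICAN AMERICAN"},
]

print(max_used_lengths(rows, ["USUBJID", "RACE"]))
```

The reported values are the lengths the columns would need if they were sized to the data rather than arbitrarily set to 200.
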
Disclaimer: The opinions expressed above are my personal thoughts and may not reflect the opinions of my employer (SAS) or CDISC.

Maria
on January 30, 2012

Hi Lex,

Thanks for your response. When I create the define.xml from OpenCDISC, the length does not represent the actual variable length in my dataset. For instance, in define.xml DM.RACE is given a length of 200 (the variable length is 32), and --SEQ is given a length of 10 with 5 significant digits and datatype=float (in my dataset I am using integers with 1 significant digit and a length of 8).

Since I am still trying to figure out a process to develop define.xml (we are not yet submitting in this format), I was wondering whether others manually correct this to match their data, or whether there is a way to pull this information from the datasets when creating define.xml.

Thanks for your help!!

Maria

Dirk
on February 1, 2012

Hey Maria,

We create the define file from our specification files; all the attributes must be declared in the specification file. The define creation is a relatively simple SAS program that writes text to a file using PUT statements (there are alternatives, but this way gave us the best control over it).
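
For readers who want to see the general shape of the spec-driven approach described above, here is a rough Python sketch (not the SAS program mentioned, which is not shown in this thread). It turns a small, hypothetical specification table into `ItemDef` elements with `xml.etree.ElementTree`. The `SPEC` rows and the function name `build_item_defs` are assumptions, and the output is deliberately simplified: it omits the namespaces, labels, and other elements that a real define.xml needs.

```python
import xml.etree.ElementTree as ET

# Hypothetical specification rows; real specs usually live in a spreadsheet.
SPEC = [
    {"dataset": "DM", "name": "USUBJID", "datatype": "text", "length": 18},
    {"dataset": "DM", "name": "RACE", "datatype": "text", "length": 32},
    {"dataset": "AE", "name": "AESEQ", "datatype": "integer", "length": 8},
]


def build_item_defs(spec):
    """Build one ItemDef element per specified variable (simplified, no namespaces)."""
    root = ET.Element("MetaDataVersion")
    for var in spec:
        ET.SubElement(root, "ItemDef", {
            "OID": f"{var['dataset']}.{var['name']}",
            "Name": var["name"],
            "DataType": var["datatype"],
            # The length comes from the specification, not from the data.
            "Length": str(var["length"]),
        })
    return root


xml_text = ET.tostring(build_item_defs(SPEC), encoding="unicode")
print(xml_text)
```

Because every attribute is read from the specification, a wrong length in the data cannot silently end up in the define file.
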

It would also be possible to get the attributes from the actual data, but we decided that this leaves too much room for error: the data is not necessarily what you wanted it to be, and if the define is created off the data you will not spot the mistakes. (Also, on the controlled terms: not all possible values may be present in the data, while you do want all of them in the define.)

Well, there's a lot more to it: we check the data against the specifications down to the level of the cells, etc. Also, about specifications vs. data and which is to be the master: there is a lot to be said for both sides, and even more for a hybrid situation... There's lots of food for thought here, maybe even process-altering thoughts. :-)
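
A cell-level check of data against specifications could look roughly like the Python sketch below. It is only an illustration under assumptions: the `spec` layout, the rule names `length` and `codelist`, the function name `check_against_spec`, and the sample rows are all hypothetical, and a real check would cover many more attributes.

```python
def check_against_spec(rows, spec):
    """Return a list of (row_number, variable, problem) tuples."""
    findings = []
    for row_number, row in enumerate(rows, start=1):
        for name, rules in spec.items():
            value = row.get(name, "")
            if len(value) > rules["length"]:
                findings.append((row_number, name, "value longer than specified length"))
            allowed = rules.get("codelist")
            # Empty cells are not checked against the controlled terms.
            if allowed is not None and value and value not in allowed:
                findings.append((row_number, name, "value not in controlled terms"))
    return findings


# Hypothetical specification and data.
spec = {
    "USUBJID": {"length": 18},
    "SEX": {"length": 1, "codelist": {"F", "M", "U"}},
}
rows = [
    {"USUBJID": "STUDY01-001-0001", "SEX": "F"},
    {"USUBJID": "STUDY01-001-0002", "SEX": "X"},
    {"USUBJID": "STUDY01-001-0003-EXTRA", "SEX": "M"},
]

for finding in check_against_spec(rows, spec):
    print(finding)
```

Note that the codelist in this sample spec contains "U" even though no row uses it, which is the situation described above: a define built from the specification would still list that term, while one built from the data would miss it.
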

Maria
on February 3, 2012

Thanks, Dirk!  I think I'll roll up my sleeves and try out a SAS program to create define.xml.

Maria

Lex
on February 3, 2012

Well, you would not have to start from scratch:

http://www.lexjansen.com/pharmasug/2011/SAS/PharmaSUG-2011-SAS-HW02.pdf

http://support.sas.com/rnd/base/cdisc/cst/index.html

Lex Jansen

Disclaimer: I work at SAS.
