ADaM Validation Webinar Q&A

Define.xml

  1. Did you say that version 1 of define.xml cannot be used for ADaM metadata?
    Yes, define.xml v1.0 has many limitations. For example, it cannot describe Value Level Metadata, which is crucial for ADaM, because it provides no way to specify which variable a piece of Value Level Metadata applies to. For ADaM, Define.xml v2.0 is required. As stated on page 98 of the Define 2.0 specification, “Because analysis datasets are developed to support specific analyses, ADaM has additional metadata that is not found in SDTM or SEND, notably analysis results metadata. Additionally, the metadata to describe variable sources and derivations is of primary importance. Define-XML version 2.0.0 can be used to transmit metadata for the ADaM IG v1.0 and higher.”
  2. For define.xml, can define 1.0 still be used, as long as there is no Analysis Results Metadata?
    No, define.xml v1.0 cannot be used for ADaM data. For example, it cannot define Value Level Metadata. We believe that this outdated version must be retired as soon as possible in favor of the much more robust Define.xml v2.0.

FDA Usage and Guidance

  1. What happens when these common issues are found by FDA? Does FDA ask sponsors to fix them before FDA starts the review?
    The decision on whether to request a fix is made on a case-by-case basis by the reviewer responsible for a particular submission. However, we recommend that you fix all identified data issues in advance to reduce the risk of delays during the review process.
  2. FDA released SDTM and SEND “FDA Validation Rules”. Will FDA now release ADaM “FDA Validation Rules”, or do you expect they will sometime in the future?
    Only FDA may talk about their plans.
  3. Can you address the FDA’s plans on future use of the OpenCDISC Validator?
    Only FDA may talk about their plans. Currently FDA is using a custom implementation of OpenCDISC Enterprise (a.k.a. DataFit) to assess the compliance and quality of submission data. FDA is very supportive of open source and free software like OpenCDISC Community. We believe that both FDA and industry will benefit from the OpenCDISC project.

OpenCDISC Community availability and features

  1. How long does it typically take to run OpenCDISC for a phase III study with 1000 subjects?
    A typical validation run for a 1000-subject study takes 5-10 minutes, but treat that as a rough figure: a precise estimate is not possible without the parameters described here. Performance is not tied to phase or subject count as such, except that Phase III trials tend to have more patients and data. It depends on the number of dataset records, the rules executed, and the available machine resources such as memory, CPU, I/O, and the type of JVM (32-bit vs. 64-bit). Users can allow the Validator to use more memory and more CPU threads, depending on how many cores the machine has. A Performance and Scalability guide published on opencdisc.org covers these settings. The Validator is exceptionally fast in most cases we’ve seen. However, for very large trials you may need a server-based environment like OpenCDISC Enterprise to achieve the desired performance.
  2. I saw you used OpenCDISC ADaM Validation Rules v1.6 in your demo, but you just mentioned version ADaM Validation rules v1.3 will be in the next release of OpenCDISC, can you clarify?
    It may be confusing. There are separate versions for the OpenCDISC engine (e.g., v1.5 or 2.0.1) and for the validation specifications. There are also versions of the external documents we refer to in our validation rules.
    This is best explained on slide 8. OpenCDISC Validation rules maintain their own versions, which trace to CDISC publications. Currently the latest version of the OpenCDISC Validation rules is 1.6, which maps to CDISC ADaM rules 1.3 and is currently available only in the Enterprise version.
  3. I saw you were using the Enterprise version. Do you think the Community version has the same functionality?
    The Validator Community version provides basic functionality. It supports official FDA SDTM/SEND business rules and CDISC ADaM checks. The Community version is designed as a desktop application for individual use (for example, QC). The Enterprise version is designed as a company-wide collaborative environment for compliance and data quality. It has much more robust functionality. Please visit the Pinnacle21.net website for details.
  4. To use the OpenCDISC ADaM validator on the current level of ADaM validation checks is no longer free, right? OpenCDISC is no longer really free, or is there a plan to make the 1.3 level checks free?
    OpenCDISC Community is free and will include new ADaM checks in one of the upcoming releases. We typically release new rules to Enterprise users first as a benefit of being a sponsor of OpenCDISC.
  5. When will the next version of Community be released?
    The next patch release of Community is planned for July. It will update all controlled terminology files and include minor bug fixes to one or more of the four tools. The next minor release (v2.1) is targeted for September 2015.
  6. Will the new ADaM rules be available (which include ADAE, TTE and others) in the OpenCDISC Community tool or is it for Enterprise only?
    The new ADaM validation rules will be released to Community later this year. All new rule updates are released to the Enterprise version first.

Dataset recognition and ADaM Standard Naming conventions

  1. If we combine two domains from SDTM (i.e. AA and BB) for ease of programming in ADaM, should the ADaM dataset be named one of the following (ADAA, ADBB, ADXX)? Will OpenCDISC perform checks on non-standard domain names?
    Regardless of naming conventions, OpenCDISC will perform checks on all datasets that can be identified as BDS according to our prototype definition of what we believe defines a BDS dataset: one that has at least one of the variables mentioned on slide 11, namely PARAMCD, PARAM, AVAL, AVALC, ADT, ASTDT, ASTDTM, CNSR, CNSDTDSC, EVNTDESC.
    Regarding naming conventions, CDISC ADaM v2.1, section 4.1.2, page 10 states (with the exception of the reserved names ADSL and ADAE): “Analysis datasets must be named using the convention “ADxxxxxx”, where xxxxxx portion of the name is sponsor-defined, using a common naming convention across a given submission or multiple submissions for a product. In developing naming conventions, sponsors should consider requirements from eCTD guidance document and SAS Transport format (e.g., name cannot exceed 8 characters).”
    My opinion is that if you combine two SDTM datasets, you could apply a consistent naming pattern such as ADAABB, where you simply combine the two-letter domain codes, putting first the more dominant one or the one that would normally occur first in the trial. For example, combining Medical History and Laboratory could give ADMHLB, since medical history will always be collected and occurs naturally in time prior to lab tests. This may be useful if medical history contains historical lab tests used as a meta-baseline.
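
The two ideas in this answer can be sketched in a few lines of Python. This is only an illustration, not OpenCDISC's implementation: the function names are hypothetical, and the regular expression is one assumed reading of the "ADxxxxxx" convention ("AD" plus one to six letters or digits, so the name stays within eight characters).

```python
import re

# Variables listed in the answer above (slide 11). Having any one of them
# marks a dataset as BDS under the prototype definition.
BDS_MARKERS = {
    "PARAMCD", "PARAM", "AVAL", "AVALC", "ADT",
    "ASTDT", "ASTDTM", "CNSR", "CNSDTDSC", "EVNTDESC",
}

# Assumed reading of "ADxxxxxx": "AD" followed by 1 to 6 letters or digits.
ADAM_NAME = re.compile(r"AD[A-Z0-9]{1,6}")


def looks_like_bds(variable_names):
    """Return True if at least one BDS marker variable is present."""
    return any(name.upper() in BDS_MARKERS for name in variable_names)


def follows_adam_naming(dataset_name):
    """Return True if the name fits the assumed ADxxxxxx pattern."""
    return ADAM_NAME.fullmatch(dataset_name.upper()) is not None
```

For example, `follows_adam_naming("ADMHLB")` is true, while a twelve-character name such as `"ADMEDHISTORY"` is rejected because it exceeds eight characters.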

Calculation Rules

  1. For derived numerical values in ADaM data, what is the recommendation on rounding: round in ADaM or in output program?
    We recommend discussing and defining the rounding approach in the study SAP.
  2. When you built your check for PCHG, did you use rounding?
    Yes. Specifically, it uses the java.math.RoundingMode.HALF_UP enumeration constant, which rounds towards the “nearest neighbor” unless both neighbors are equidistant, in which case it rounds up (away from zero). This is how rounding is typically taught in school.
  3. For the Arithmetic check how does one set the precision for the check? (e.g.: chg=0.24 and the check has a value 0.2434)
    The precision of the check cannot be set explicitly in the rule; however, the expected precision can be prescribed in the define.xml using the SignificantDigits attribute.
  4. In checking PCHG = (AVAL – BASE)/BASE *100, how many decimals does OCV consider?
    8 decimal places, using the java.math.BigDecimal.divide method, which takes a divisor, scale, and roundingMode.
  5. I think your PCHG formula is incorrect. You would get the wrong PCHG if baseline is a negative number and post-baseline is a positive number.
    Yes, I can see your point. Consider these two cases:
    [BASE = -2, AVAL = 4]: (4 – (-2)) / -2 * 100 = -300%,
    [BASE = 2, AVAL = 8]: (8 – 2) / 2 * 100 = 300%.

    In both cases the value increased by 6 from BASE, yet the formula reports -300% for the first case and 300% for the second, because the negative baseline flips the sign. We could enhance the check to take the absolute value (for example, of BASE in the denominator).
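
The rounding and sign behavior discussed in questions 2 to 5 can be reproduced with a small sketch. It uses Python's decimal module as a stand-in for java.math.BigDecimal, so it is not the Validator's code. The function names, the order of operations (divide, round to 8 places, then multiply by 100), and the abs_denominator option are our assumptions for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

EIGHT_PLACES = Decimal("0.00000001")  # scale of 8, as in the answer to question 4


def pct_change(aval, base, abs_denominator=False):
    """PCHG = (AVAL - BASE) / BASE * 100, quotient rounded HALF_UP to 8 places.

    abs_denominator=True divides by |BASE| instead (hypothetical enhancement).
    """
    aval, base = Decimal(str(aval)), Decimal(str(base))
    if base == 0:
        return None  # PCHG is undefined for a zero baseline
    denominator = abs(base) if abs_denominator else base
    # The division runs at the default 28-digit context precision and is
    # then rounded to 8 decimal places with HALF_UP.
    ratio = ((aval - base) / denominator).quantize(
        EIGHT_PLACES, rounding=ROUND_HALF_UP
    )
    return ratio * 100


def agrees_at(reported, computed, decimals):
    """Compare a reported value with a computed one at a given number of decimals."""
    step = Decimal(1).scaleb(-decimals)  # e.g. decimals=2 gives 0.01
    rounded = Decimal(str(computed)).quantize(step, rounding=ROUND_HALF_UP)
    return Decimal(str(reported)) == rounded
```

With these definitions, `pct_change(4, -2)` gives -300 and `pct_change(4, -2, abs_denominator=True)` gives 300, matching the two cases in question 5. `agrees_at("0.24", "0.2434", 2)` is true, which is the situation described in question 3.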

Traceability Rules

  1. “On-treatment” (ONTRTFL) can be defined clinically and scientifically based on half life of the compound. A protocol may define 2 days after last dose as on treatment. Does OCV allow some lags in the treatment period?
    No. The rule will fire if your ADT is not within the treatment start and end dates. If you feel the maximum treatment date should reflect the half-life, you could probably add that date in addition to the last date the treatment was actually administered.
  2. When a subject screen failed and re-screened and is then randomized into a study, the SDTM domain DM for this subject has unique SUBJID(s) but same USUBJID when screen failures are included in SDTM. Will some of the ADaM-to-SDTM checks fail?
    Yes, some checks will fail in both SDTM and ADaM. In general, this seems problematic for both standards. SD0083 will fail for SDTM DM regardless, and AD0054 will fail in ADaM ADSL if both screening and randomized entries are included. Rules AD0205, AD0208, AD0209 and AD0210 compare ADSL AGE, SUBJID, SITEID and ARM to SDTM.DM. They will pass if the match happens to be on the randomized patient in DM, but fail if the screen failure is matched. We could possibly add a clause to check only randomized patients in DM.
  3. You mentioned the USUBJID has to be unique in the ADSL, how to handle a re-screened patient with multiple subjid? Per FDA conformance guide, the patient needs to have same usubjid in an application.
    Much of the answer is the same as for the previous question. However, that question assumed that only DM had duplicate USUBJIDs. If ADaM has duplicate USUBJIDs, then AD0054 will fail, and possibly the other traceability checks mentioned there as well. However, I am not sure why ADSL would include the screen failure record.
  4. Are you suggesting at all that supporting SDTM datasets for traceability be submitted with the ADaM datasets (i.e. in same folder), or just to include them as part of OpenCDISC validation?
    The datasets do not need to be in the same eCTD folder, and in fact they should not be. ADaM and SDTM have their own folders for regulatory submission. You just need to include them in the same data package for validation. For example, in Validator Community you can use the “Add more files” link to include the AE, DM and EX domains with the ADaM datasets.
  5. For integrated analysis, will integrated SDTM data be required for integrated ADaM data?
    Unfortunately, there are no standards or formal regulatory guidance documents on integrated data. Our recommendation is to discuss your plan for ISS/ISE analysis with reviewers, as suggested by the FDA Technical Conformance Guide.
  6. For traceability from ADaM to SDTM, would you suggest that we include all corresponding SDTM datasets when we use OpenCDISC Enterprise to validate ADaM datasets, for example, DM for ADSL, AE for ADAE, etc.?
    Yes, you should include AE, DM and EX domains when validating ADaM data.
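
Two of the checks discussed in this section, the treatment-window test from question 1 and the USUBJID uniqueness test from questions 2 and 3, can be sketched as follows. This is only a rough illustration with hypothetical function names. The parameter names trtsdt and trtedt are our labels for the treatment start and end dates, and lag_days is not a Validator option: it only shows what a protocol-defined lag would look like if you computed the window yourself.

```python
from collections import Counter
from datetime import date, timedelta


def outside_treatment_window(adt, trtsdt, trtedt, lag_days=0):
    """True when ADT falls outside [start, end + lag_days].

    With lag_days=0 this mirrors the behavior described in question 1.
    """
    return not (trtsdt <= adt <= trtedt + timedelta(days=lag_days))


def duplicated_usubjids(rows):
    """Return the USUBJID values that appear on more than one row."""
    counts = Counter(row["USUBJID"] for row in rows)
    return sorted(usubjid for usubjid, n in counts.items() if n > 1)
```

For a subject whose last dose was on 2015-03-10, an assessment dated 2015-03-12 is outside the window with no lag, but inside it when `lag_days=2`. A re-screened subject who appears twice under one USUBJID shows up in the list returned by `duplicated_usubjids`.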

Consistency Rules

  1. It seems that one PARAMCD should be able to have multiple PARCAT, e.g., PARAM=’Glucose’, PARCAT1 can be ‘Urinalysis’ or ‘Chemistry’.
    No, this is an invalid approach. PARCATy is a categorization of PARAM and PARAMCD. For example, a value of PARCAT1 might group the parameters having to do with a particular questionnaire, lab specimen type, or area of investigation. The correct way is to include all details in the parameter name, e.g., PARAM=“Blood Glucose, mg/dL” and PARAM=“Urine Glucose, mmol/L”, and then use PARCAT1 to group all Chemistry or Urinalysis records. Please note that you cannot then use the same parameter with different Parameter Category values. Please see the ADaM IG for details.
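
The consistency requirement in this answer, that one parameter cannot carry different Parameter Category values, can be sketched as a simple grouping check. The function name and the record layout (one dictionary per row) are assumptions for illustration and do not reflect how the Validator stores data.

```python
from collections import defaultdict


def params_with_multiple_categories(records, category="PARCAT1"):
    """Map each PARAM to its category values, keeping only the offenders.

    A PARAM is an offender when it appears with more than one distinct
    value of the chosen category variable.
    """
    seen = defaultdict(set)
    for rec in records:
        seen[rec["PARAM"]].add(rec[category])
    return {param: cats for param, cats in seen.items() if len(cats) > 1}
```

A dataset in which PARAM "Glucose" appears under both "Chemistry" and "Urinalysis" would be reported, while one that uses "Blood Glucose, mg/dL" and "Urine Glucose, mmol/L" as separate parameters would return an empty result.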

Tech Conformance Rules

Please let us know if you find any SAS date, time or datetime format missing from our current specifications. We recognize that some SAS date/time formats also have alternative representations (aliases), which should be added to our specs as well.

  1. Does OCV recognize SAS numeric dates such as Date9?
    Yes. Because all dates in SAS are actually numbers (refer to slides 23 and 37), we recognize a SAS date in any display format.
  2. Currently the Validator contained in OpenCDISC Community does not work correctly regarding identification of correct date, date time and time formats. By when will this be corrected?
    Assuming this question deals strictly with the display format, we would like to know where the exhaustive list of valid SAS date display formats is and which format is failing for you. We recognize the following (note that DATE includes DATE9). In general, many of these can be followed by numbers.

    AD0041 (Date):

    B8601DA, B8601DN, DATE, DAY, DDMMYY, DDMMYYB, DDMMYYC, DDMMYYD, DDMMYYN, DDMMYYP, DDMMYYS, DOWNAME, DTMONYY, E8601DA, JULDAY, JULIAN, MMDDYY, MMDDYYB, MMDDYYC, MMDDYYD, MMDDYYN, MMDDYYP, MMDDYYS, MMYY, MMYYB, MMYYC, MMYYD, MMYYN, MMYYP, MMYYS, MONNAME, MONTH, MONYY, PDJULG, PDJULI, QTR, QTRR, WEEKDATE, WEEKDATX, WEEKDAY, WEEKU, WEEKV, WEEKW, WORDDATE, WORDDATX, YEAR, YYMM, YYMMDD, YYMMDDB, YYMMDDC, YYMMDDD, YYMMDDN, YYMMDDP, YYMMDDS, YYMMB, YYMMC, YYMMD, YYMMN, YYMMP, YYMMS, YYMON, YYQ, YYQB, YYQC, YYQD, YYQN, YYQP, YYQS, YYQR, YYQRB, YYQRC, YYQRD, YYQRN, YYQRP, YYQRS

    AD0042 (Time):

    B8601LZ, B8601TM, B8601TZ, E8601LZ, E8601TM, E8601TZ, HHMM, HOUR, MMSS, TIME, TIMEAMPM

    AD0043 (Datetime):

    B8601DT, B8601DZ, DATEAMPM, DATETIME, DTDATE, DTWKDATX, DTYEAR, DTYYQC, E8601DN, E8601DT, E8601DZ, MDYAMPM, TOD
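
The two points in this section, that a SAS date is a number and that a display format name can be followed by numbers, can be sketched as follows. The code is an illustration rather than the Validator's logic: the function names are hypothetical, KNOWN_FORMATS holds only a small subset of the lists above, and the regular expression is an assumed way to strip the trailing width and decimal part from a format such as DATE9.

```python
import re
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # a SAS date value counts days from this day

# Small illustrative subset of the format lists above, not the full spec.
KNOWN_FORMATS = {
    "DATE": "date", "YYMMDD": "date", "E8601DA": "date",
    "TIME": "time", "E8601TM": "time",
    "DATETIME": "datetime", "E8601DT": "datetime",
}

# Base name ending in a letter, then an optional width and ".decimals" part.
FORMAT_SPEC = re.compile(r"(.*?[A-Z])(\d*)(?:\.(\d*))?")


def sas_date_to_date(value):
    """Convert a numeric SAS date (days since 1960-01-01) to a calendar date."""
    return SAS_EPOCH + timedelta(days=value)


def classify_format(display_format):
    """Return 'date', 'time' or 'datetime' for a known format, else None."""
    match = FORMAT_SPEC.fullmatch(display_format.strip().upper())
    if match is None:
        return None
    return KNOWN_FORMATS.get(match.group(1))
```

Under these assumptions, `classify_format("DATE9.")` and `classify_format("YYMMDD10.")` both return `"date"`, and `sas_date_to_date(0)` is 1 January 1960 regardless of which display format is attached to the variable.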

Webinar Experience

  1. Where else can I ask questions?
    You can submit all questions to our OpenCDISC Forum or email our Support Team. Please also send any errors you encounter in OpenCDISC Community our way.