Datxplore2018R1 documentation

1.Home

Version 2018

This documentation is for Monolix Suite 2018.
©Lixoft

Purpose

Datxplore is a graphical and interactive software for the exploration and visualization of data. The data sets under consideration are those used for population modeling, as defined here. Datxplore provides various plots and graphics (box plots, histograms, survival curves…) to study the statistical properties of discrete and continuous data, and to analyze how the data behave depending on covariates, individuals, etc. It can be used to:

  • See the PK dynamics over time, with an optional log-scale display.
  • See if there are outliers.
  • See the PD versus the PK.
  • Split, color and filter the data set to see how the outcome depends on the covariates.
  • Plot one covariate against another and see their correlation.

Layout

The graphical interface comprises three parts in addition to the menu: a Welcome frame, a Data frame to manage your data set, and a Dataviewer frame to explore the data set.

Menu

In the menu, the user can find:

  • Project: Create, load or save a project.
  • Help: About. License.

Data selection

The data sets considered here are dedicated to population modeling. The population approach describes phenomena observed in each individual of a set, together with the variability between individuals. The data are thus individual data, and are often longitudinal (over time). For each subject, the data set contains measurements, dose regimen, covariates, etc., i.e. all collected information.
The first thing to do is to choose a data set and tag its columns with column-types as defined here.

Data manipulation

This part is used to select, filter or split the data. See the Data manipulation section.

Data visualization

In the Dataviewer frame, the user can see all the available graphics. See the Data visualization section.

2.Data selection

Data set structure

The data set contains, for each subject, measurements, dose regimen, covariates, etc., i.e. all collected information. The data must be in the long format, i.e. each line corresponds to one individual and one time point. Different types of information (dose, observation, covariate, etc.) are recorded in different columns, which must be tagged with a column-type (see below). The column-types are very similar to, and compatible with, the structure used by the Nonmem software (the differences are listed here). The column-types are specified in the Data tab, where the user selects a column-type for each column of the data set, as in the following picture. Datxplore often provides an initial guess of the column-type based on the column headers of the data set.
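As an illustration of the long format described above, the following minimal Python sketch parses a small, made-up data set and groups its lines by subject. The header names (ID, TIME, AMT, Y, WT), the values, and the use of a dot for empty cells are assumptions made for this example only; they are not taken from a Datxplore demo project.

```python
import csv
import io

# Hypothetical long-format data set: one line per individual and time point.
# Header names and values are illustrative; "." marks an empty cell here.
RAW = """ID,TIME,AMT,Y,WT
1,0,100,.,66.7
1,2,.,4.1,66.7
1,6,.,2.3,66.7
2,0,120,.,80.0
2,2,.,5.0,80.0
"""

def read_long_format(text):
    """Parse the text into a list of row dictionaries, one per data line."""
    return list(csv.DictReader(io.StringIO(text)))

def group_by_subject(rows, id_column="ID"):
    """Collect the lines of each subject, keeping the order of the file."""
    subjects = {}
    for row in rows:
        subjects.setdefault(row[id_column], []).append(row)
    return subjects

rows = read_long_format(RAW)
subjects = group_by_subject(rows)
for subject_id, lines in subjects.items():
    # Each line carries either a dose (AMT) or an observation (Y) in this example.
    print(subject_id, [(line["TIME"], line["AMT"], line["Y"]) for line in lines])
```

Each subject keeps the same covariate value (WT) on all of its lines, while doses and observations sit in separate columns.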

Description of column-types

The first line of the data set must be a header line defining the names of the columns. The column names are completely free. In the MonolixSuite applications, when defining the data, the user is asked to assign a column-type to each column. The column-type tells the application how to interpret the information in that column. The available column-types are given below:

Column-types used for all types of lines:

Column-types used for response-lines:

Column-types used for dose-lines:

Labeling

The names of the outputs appearing in the Dataviewer tab are yX, with X corresponding to the identifier given in the OBSERVATION ID column (for instance y1 and y2 if identifiers 1 and 2 were used in the OBSERVATION ID column). When no OBSERVATION ID column is present, the observations are called y. Covariates appear under the same name as in the column header.
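The naming rule above can be sketched in a few lines of Python. The function name output_names is hypothetical, and ordering the names by first appearance of each identifier is an assumption of this sketch, not a documented behavior of Datxplore.

```python
def output_names(observation_ids=None):
    """Return the display names of the outputs.

    observation_ids: identifiers read from the OBSERVATION ID column,
    or None when the data set has no such column.
    """
    if observation_ids is None:
        return ["y"]
    seen = []
    for obs_id in observation_ids:
        # Keep each identifier once, in order of first appearance (assumption).
        if obs_id not in seen:
            seen.append(obs_id)
    return ["y" + str(obs_id) for obs_id in seen]

print(output_names([1, 2, 1, 2]))
print(output_names())
```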

Loading a new data set

To load a new data set, click on “Browse” (green highlight below), use the pop-up window to select your data set, tag all the columns (blue highlight), and click on the ACCEPT button (purple highlight), as shown in the following figure:

Observation types

There are three types of observations, which must be tagged in the OBSERVATION column in the Data tab:

  • continuous: The observation is continuous with respect to time. For example, a concentration is a continuous observation.
  • discrete: The observation values are on a discrete scale. For example, the observation can be a categorical observation (an effect can be observed as low, medium, or high) or a count observation over a defined time (the number of epileptic seizures in a defined time).
  • event: The observation is the time elapsed until an event occurs, for example cancer recurrence.

 

 

3.Data manipulation

Interacting with the plots

Within the frame Dataviewer, the right part of the interface holds a panel with several tabs to interact with the plots:

  • The tab Settings provides options specific to each plot, such as hiding or displaying elements of the plot, modifying some elements, or changing axes scales and limits.
  • The tab Stratify can be used to select one or several covariates for splitting, filtering or coloring the points of the plot.

Stratification: split, color, filter

The Stratify tab allows you to create and use covariates for stratification purposes. It is possible to select one or several covariates for splitting, filtering or coloring the data set or the diagnosis plots, as shown in the following video.





The following figure shows a plot of the concentration from the warfarin data set, stratified by coloring individuals according to the continuous covariate wt: the observed data are divided into three groups, which were set to equal size with the button rescale. It is also possible to set groups of equal width, or to customize the bins.

Moreover, clicking on a group on the right side panel highlights only the individuals belonging to this group, as can be seen below:

Values of categorical covariates can also be assigned to new groups, which can then be used for stratification.
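The two grouping options mentioned above for a continuous covariate, equal size and equal width, can be sketched as follows. This is only an illustration of the two ideas under simple assumptions; it is not Datxplore's own binning code, and the function names and weights are hypothetical.

```python
def equal_width_edges(values, n_groups):
    """Edges of n_groups intervals of identical width spanning the data range."""
    low, high = min(values), max(values)
    step = (high - low) / n_groups
    return [low + i * step for i in range(n_groups)] + [high]

def equal_size_groups(values, n_groups):
    """Split the sorted values into n_groups groups of (almost) equal count."""
    ordered = sorted(values)
    n = len(ordered)
    bounds = [round(i * n / n_groups) for i in range(n_groups + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(n_groups)]

# Hypothetical body weights (kg), not taken from the warfarin data set.
weights = [40.0, 55.0, 58.0, 60.0, 70.0, 72.0, 75.0, 80.0, 100.0]
print(equal_width_edges(weights, 3))
print(equal_size_groups(weights, 3))
```

With equal width, the intervals cover the same range of the covariate but may hold very different numbers of individuals; with equal size, each group holds about the same number of individuals but the intervals differ in width.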

Layout

The layout can be modified with buttons on top of each plot to select the number of plots (green highlight below), to arrange the layout (purple highlight below) and to remove the settings frame (pink highlight).

The layout button (second button) can be used to select predefined layouts for the subplots, for instance all subplots in one row or all subplots in one column.

 

4.Data visualization

Purpose

There are several representations depending on the type of data under consideration.

Outcome visualization

The outcome is plotted with respect to time in the Dataviewer frame.

The axes can be switched to a log scale, as in the following figure:

An interesting feature is the possibility to display the dosing times, as in the figure below. In the proposed example (PKVK_project of the demos), the dosing times of an individual are displayed when the user hovers over that individual’s data.

Information is also provided on:

  • The total number of subjects
  • The average number of doses per subject
  • The total, average, minimum and maximum number of observations per individual.
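The summary figures listed above can be sketched in Python as follows. The record layout, a (subject, kind) pair with kind equal to "dose" or "obs", and the function name summarize are hypothetical choices made for this example; they do not describe how Datxplore stores or computes these values.

```python
def summarize(records):
    """Compute summary figures from (subject, kind) records.

    kind is "dose" or "obs"; both labels are hypothetical.
    """
    doses, observations = {}, {}
    for subject, kind in records:
        # Register every subject, even one without doses or observations.
        doses.setdefault(subject, 0)
        observations.setdefault(subject, 0)
        if kind == "dose":
            doses[subject] += 1
        elif kind == "obs":
            observations[subject] += 1
    n_subjects = len(doses)
    obs_counts = list(observations.values())
    return {
        "subjects": n_subjects,
        "mean_doses": sum(doses.values()) / n_subjects,
        "total_obs": sum(obs_counts),
        "mean_obs": sum(obs_counts) / n_subjects,
        "min_obs": min(obs_counts),
        "max_obs": max(obs_counts),
    }

# Made-up records: subject 1 has one dose and three observations,
# subject 2 has two doses and one observation.
records = [
    ("1", "dose"), ("1", "obs"), ("1", "obs"), ("1", "obs"),
    ("2", "dose"), ("2", "dose"), ("2", "obs"),
]
summary = summarize(records)
print(summary)
```

Applying the same function to the records of each covariate group would give per-subplot figures.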

In addition, if the plot is split based on a covariate, the information adapts to each subplot:

In case of several continuous outputs, one can plot one outcome w.r.t. another one as on the following figure for the warfarin data set. The direction of time is indicated by the red arrow.

For discrete outcomes, it is possible to display the outcome either in the same way as continuous outcomes or stacked, as in the following figure corresponding to the Zylkene data set.

For time-to-event data, one can see the empirical Kaplan-Meier plot of the first event, as in the following example of the length of hospital stay for cardiovascular patients.
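For readers unfamiliar with the Kaplan-Meier plot, the sketch below computes the standard product-limit estimate of the survival function from first-event times. It is a textbook version written for illustration, with made-up times; it is not the code used by Datxplore, and the function name kaplan_meier is hypothetical.

```python
def kaplan_meier(times, observed):
    """Return [(time, survival)] at each distinct time with at least one event.

    times: time of the first event, or of censoring, for each individual.
    observed: True if the event occurred, False if the individual was censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        events = sum(1 for ti, obs in zip(times, observed) if ti == t and obs)
        leaving = sum(1 for ti in times if ti == t)
        if events > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        # Individuals with an event or censored at t leave the risk set.
        at_risk -= leaving
    return curve

# Made-up data: five individuals, two of them censored.
times = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(times, observed)
for t, s in curve:
    print(t, round(s, 3))
```

Censored individuals do not produce a step in the curve, but they reduce the number of individuals at risk for later events.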

Covariate display

It is possible to display one covariate versus another. In the following figure, age is displayed versus wt, and the correlation coefficient is shown as additional information.
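The text does not say which correlation coefficient is reported. Assuming it is the usual Pearson coefficient, it could be computed as in the following sketch; the function name and the covariate values are made up for the example.

```python
import math

def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical covariate values (age in years, weight in kg).
age = [22, 35, 41, 58, 63]
wt = [60.0, 72.5, 70.0, 81.0, 78.5]
print(round(pearson_correlation(age, wt), 3))
```

The coefficient lies between -1 and 1; values close to either bound indicate a strong linear relationship between the two covariates.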

We can also display a categorical covariate w.r.t. another categorical covariate as a histogram (stacked or grouped),

and a continuous covariate w.r.t. a categorical covariate as a boxplot, as in the following example.
