This documentation is for Monolix Suite 2016R1.
Datxplore is an interactive graphical application for exploring and visualizing data. The data sets under consideration are those used for population modeling, as defined here. Datxplore provides various plots and graphics (box plots, histograms, survival curves…) to study the statistical properties of discrete and continuous data, and to analyze how the data behave depending on covariates, individuals, etc.
The graphical interface comprises three parts in addition to the menu.
In the menu, the user can find:
- Project: Create, load or save a project. Export graphic(s) to .png, .bmp or .jpg format.
- Tools: Access a window used to modify covariates. Display the data on a logarithmic Y-Axis when possible.
- Help: About. License.
The data selection frame contains two lists that inventory the data in the user's data set. The X-Axis list is updated according to the element selected in the Y-Axis list. Once an element is selected in each list, the graphics are created. See Data selection section.
The data manipulation frame is used to select, filter or split data. See Data manipulation section.
In the Graphics frame, the user can see all the possible graphics. See Data visualization section.
The goal is to define the variables under consideration in the data set. One can look at observations, covariates and categories. This is specified when the user defines the type of each column in the data set, as in the following picture.
Notice that Datxplore often provides an initial guess of a column's type based on its name.
Possible types for Datxplore
The user can choose the column type from the following list:
- ignore: corresponds to a column in the data set the user wants to ignore.
- id: corresponds to the id of the subject.
- time: corresponds to the time.
- y: corresponds to the observations.
- yType: corresponds to the type of the observations. This is useful only when several types of observations are present in the data set. For example, a PKPD data set contains two types of observations, the PK and the PD respectively. To differentiate them, yType equals 1 for the first type of observation and 2 for the second.
- cov: corresponds to a continuous covariate, the weight for example.
- cat: corresponds to a categorical covariate, the gender for example.
- reg: corresponds to a regressor of the data set, the outdoor temperature for example.
- date: corresponds to the date. This can be combined with the time to have another definition of the time.
- mdv: corresponds to missing dependent values.
- amt: corresponds to the amount of drug.
The name shown in the figures and in the data selection lists is the one defined in the label. By default, the label is the one defined in the data set. The user can modify it.
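The name-based initial guess of a column type can be pictured as a simple lookup. The sketch below is only an illustration: the function `guess_column_type` and the name patterns in `GUESSES` are hypothetical and do not reproduce Datxplore's actual rules; only the type keywords (`id`, `time`, `y`, `amt`, `mdv`, `date`, `ignore`) come from the list above.

```python
# Hypothetical name-based guess of a column type. The candidate names are
# assumptions made for illustration, not Datxplore's actual heuristics.
GUESSES = {
    "id": ("id", "subject"),
    "time": ("time", "t"),
    "y": ("y", "dv", "conc"),
    "amt": ("amt", "dose"),
    "mdv": ("mdv",),
    "date": ("date",),
}


def guess_column_type(header):
    """Return a column type for a header, or 'ignore' when nothing matches."""
    name = header.strip().lower()
    for column_type, names in GUESSES.items():
        if name in names:
            return column_type
    return "ignore"


# Made-up header row of a data set.
headers = ["ID", "TIME", "AMT", "DV", "WT", "SEX"]
guessed = {h: guess_column_type(h) for h in headers}
```

Columns that fall back to `ignore` here (`WT`, `SEX`) are exactly the ones the user would then set by hand, for instance to `cov` and `cat`.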
There are three types of observations:
- continuous: The observation is continuous with respect to time. For example, a concentration is a continuous observation.
- categorical: The observation takes its values in a finite set of categories. For example, it can be a categorical observation (an effect observed as low, medium or high) or a count over a defined period (the number of epileptic seizures in a given time interval).
- event: The observation is an event, for example the occurrence of an epileptic seizure.
The user specifies the observation type in the interface, as in the following figure
Notice that for multiple outputs, the user must define all the names. By default, these are named y_1, y_2, …
There are various ways to visualize data. The data can be filtered, split or displayed by individual.
First, the data can be displayed for all individuals together or for selected IDs only.
- All: When this feature is chosen, all individuals are displayed.
- Selection: When this feature is enabled, Datxplore displays the data of the selected individuals only.
The filter feature restricts the data to a subset of the values of one or several covariates (discrete or continuous). Each group of a continuous covariate and each modality of a discrete covariate can be included in or excluded from the graphics.
In the warfarin example, the user can keep only the males (gender equals 1) and only the ages under 32. The filter is set as follows in the left frame
and the following graphic is produced
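The logic of such a filter can be sketched outside of Datxplore. The example below is a minimal illustration, not Datxplore code: the records are made up (they are not the warfarin data) and `apply_filters` is a hypothetical helper that keeps a row only if it satisfies every condition.

```python
# Made-up records for illustration; these are not the actual warfarin data.
records = [
    {"id": 1, "gender": 1, "age": 28, "weight": 70.0},
    {"id": 2, "gender": 0, "age": 30, "weight": 58.5},
    {"id": 3, "gender": 1, "age": 41, "weight": 82.0},
    {"id": 4, "gender": 1, "age": 25, "weight": 75.5},
]


def apply_filters(rows, predicates):
    """Keep the rows that satisfy every predicate (logical AND)."""
    return [row for row in rows if all(p(row) for p in predicates)]


filtered = apply_filters(records, [
    lambda r: r["gender"] == 1,  # males only
    lambda r: r["age"] < 32,     # age under 32
])
kept_ids = [r["id"] for r in filtered]
```

Combining the conditions with a logical AND mirrors the example in the text, where both the gender and the age condition must hold for an individual to be displayed.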
It is also possible to split a graphic according to one or several covariates. The picture below is the result of a split on gender.
It is also possible to split over several covariates and to combine filter and split.
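Conceptually, a split partitions the individuals into one panel per combination of covariate values, and it can be applied to already filtered data. The sketch below illustrates this idea under stated assumptions: the rows are made up and `split_by` is a hypothetical helper, not a Datxplore function.

```python
from collections import defaultdict

# Made-up rows for illustration; these are not the actual warfarin data.
rows = [
    {"id": 1, "gender": 1, "age": 28},
    {"id": 2, "gender": 0, "age": 30},
    {"id": 3, "gender": 1, "age": 41},
    {"id": 4, "gender": 1, "age": 25},
]


def split_by(rows, covariates):
    """Group rows by the tuple of their values for the given covariates."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in covariates)
        groups[key].append(row)
    return dict(groups)


# Filter first (age under 32), then split the remaining rows on gender.
young = [r for r in rows if r["age"] < 32]
panels = split_by(young, ["gender"])
panel_ids = {key: [r["id"] for r in group] for key, group in panels.items()}
```

Passing several names, for example `["gender", "age"]`, yields one group per observed combination, which corresponds to splitting over several covariates.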
If the groups of covariates used in filter and split are not adequate for the study, they can be modified by clicking on the menu bar Tools->Covariate transformation.
For continuous covariates, groups can be added or removed by clicking on the Add button. The bounds of each group can then be set, as in the proposed example.
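Grouping a continuous covariate by bounds amounts to finding the interval that contains each value. The sketch below shows one way to do this with the standard `bisect` module. The bounds are hypothetical, and the choice that a value equal to a bound belongs to the upper group is an assumption; the text does not say which convention Datxplore uses.

```python
from bisect import bisect_right


def group_index(value, bounds):
    """Return the index of the group containing value.

    bounds must be sorted. With bounds [b0, b1] the groups are
    value < b0 (index 0), b0 <= value < b1 (index 1), value >= b1 (index 2).
    """
    return bisect_right(bounds, value)


age_bounds = [32, 50]          # hypothetical bounds defining three age groups
ages = [25, 32, 49, 50, 71]    # made-up ages
groups = [group_index(a, age_bounds) for a in ages]
```

Adding a bound to the list creates one more group, and removing a bound merges two adjacent groups.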
For discrete covariates, modalities can be merged into groups, which then behave as modalities themselves, as follows
There are several representations depending on the type of data under consideration. Six kinds of graphics are available depending on the selected data in the list of Y-Axis and X-Axis.
The spaghetti plot represents Y w.r.t. X by individual when both X and Y are continuous variables. This is one of the most frequently used graphics. For example, plotting a continuous observation with respect to time is usually the first graphic of interest. In the warfarin example, one can plot the concentration w.r.t. time as in the following figure
The scatter plot represents dots of Y versus X when both X and Y are continuous variables and X is a covariate. In the warfarin example, one can plot the concentration w.r.t. the weight as in the following figure
Box & Whiskers Plot
The box plot summarizes the distribution of a continuous variable w.r.t. a discrete variable. It is a convenient way to graphically depict groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles. In the warfarin example, one can plot the distribution of the weight w.r.t. the gender as in the following figure
It is useful for seeing the variation of an observation or of a continuous covariate.
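The quartiles behind a box plot can be computed per group with the standard library. The sketch below is an illustration on made-up weights (not the warfarin data); `box_stats` is a hypothetical helper, and the `"inclusive"` quantile method is an assumption, since the text does not say which quantile or whisker convention Datxplore uses.

```python
from statistics import quantiles

# Made-up weights grouped by gender; these are not the actual warfarin data.
weights_by_gender = {
    0: [52.0, 58.5, 61.0, 64.5, 70.0],
    1: [68.0, 72.5, 75.5, 82.0, 90.0],
}


def box_stats(values):
    """Return the quartiles and interquartile range of a list of numbers."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3, "iqr": q3 - q1}


stats_by_gender = {g: box_stats(w) for g, w in weights_by_gender.items()}
```

Each entry of `stats_by_gender` holds the values that define one box: its lower edge (`q1`), its middle line (`median`) and its upper edge (`q3`).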
The cumulative histogram represents a cumulative count of a discrete variable w.r.t. a continuous variable. In the warfarin example, one can plot the cumulative count of the gender w.r.t. the weight as in the following figure
Histogram by group
The histogram plot represents a cumulative count of a discrete variable w.r.t. a discrete variable. In the warfarin example, one can plot the cumulative count of the gender w.r.t. the age as in the following figure
(Kaplan-Meier estimator and events mean)
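For readers unfamiliar with the Kaplan-Meier estimator, the sketch below shows the standard product-limit computation on made-up event data. It is a generic illustration, not Datxplore's implementation: the function `kaplan_meier` and the data are hypothetical, and how Datxplore handles ties or censoring is not described here.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurs.

    times: observed time for each individual.
    events: True if the event occurred at that time, False if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Individuals with an event at t, and all individuals leaving at t.
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up data: five individuals, two of them censored.
times = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
```

The survival estimate only drops at times where an event is observed; censored individuals simply reduce the number at risk for later times.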
Links between graphics and combinations of variables
Here is a summary table that shows which plot is used for each combination of X and Y data.
| Y data | X data | Plot |
|---|---|---|
| Continuous observation | Continuous observation | Spaghetti |
| Continuous observation | Continuous covariate | Scatter |
| Continuous observation | Discrete covariate | Box plot |
| Discrete observation | Time | Cumulative histogram |
| Discrete observation | Regressor | Cumulative histogram |
| Discrete observation | Continuous covariate | Cumulative histogram |
| Discrete observation | Discrete covariate | Histogram by group |
| Continuous covariate | Continuous covariate | Scatter |
| Continuous covariate | Discrete covariate | Box plot |
| Discrete covariate | Continuous covariate | Cumulative histogram |
| Discrete covariate | Discrete covariate | Histogram by group |
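The table can also be read as a lookup from a pair of data types to a plot. The sketch below simply transcribes its rows into a dictionary keyed by (Y type, X type), following the table's column order; the names `PLOT_FOR` and `choose_plot` are hypothetical and are not part of Datxplore.

```python
# Transcription of the summary table, keyed by (Y type, X type).
PLOT_FOR = {
    ("continuous observation", "continuous observation"): "spaghetti",
    ("continuous observation", "continuous covariate"): "scatter",
    ("continuous observation", "discrete covariate"): "box plot",
    ("discrete observation", "time"): "cumulative histogram",
    # Regressor column type (the table's regression/regressor row).
    ("discrete observation", "regressor"): "cumulative histogram",
    ("discrete observation", "continuous covariate"): "cumulative histogram",
    ("discrete observation", "discrete covariate"): "histogram by group",
    ("continuous covariate", "continuous covariate"): "scatter",
    ("continuous covariate", "discrete covariate"): "box plot",
    ("discrete covariate", "continuous covariate"): "cumulative histogram",
    ("discrete covariate", "discrete covariate"): "histogram by group",
}


def choose_plot(y_type, x_type):
    """Return the plot listed in the table, or None for an unlisted pair."""
    return PLOT_FOR.get((y_type.lower(), x_type.lower()))


weight_by_gender = choose_plot("Continuous covariate", "Discrete covariate")
unlisted = choose_plot("Discrete covariate", "Time")
```

Returning `None` for pairs that do not appear in the table avoids guessing a plot for combinations the documentation does not cover.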