Using tibble with several models

I gave it a try and, sure enough, it broadly matches your first code. However, for one figure only, I can't find a match between the summary table and the five sub-tables: 56.7 (I find 51.25?!).
Thanks a lot.

data |>
  mutate(
    input_M_numeric = input_M |> str_remove("%") |> as.numeric(),
    M_binned = cut(input_M_numeric, breaks = c(0, 20, 80, 100, Inf), right = FALSE),
    T_binned = cut(input_T, breaks = c(0, .1, .5, 1, Inf), right = FALSE)
  ) |>
  summarize(mean_X = mean(output_X),
            .by = c(M_binned, T_binned)) |>
  pivot_wider(id_cols = T_binned,
              names_from = "M_binned",
              values_from = "mean_X",
              values_fill = 0)

You're right, there is a difference between averaging per model first, or across all models:

For [20,80) [0.5,1), we have

m2   35
m4  125
m4   10

so the total mean is: (35+125+10)/3 = 56.7, while averaging each model first gives: ( 35 + (125+10)/2 )/2 = 51.25

You have to decide which one is the correct result for you. The summarize(.by = c(M, T)) I gave averages across all rows without grouping by model (giving 56.7). If you want to average by model first, then across models, you can reuse data_factors to get 51.25:

data_factors |>
  summarize(mean_by_model = mean(output_X),
            .by = c(M_factor, T_factor, Models)) |>
  summarize(mean_across_models = mean(mean_by_model),
            .by = c(M_factor, T_factor))
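
To see why the two numbers differ, here is a minimal base R sketch that uses only the three values quoted above. The `cell` data frame is a made-up stand-in for that single [20,80) x [0.5,1) cell, not the real data, and the variable names are just for illustration.

```r
# Stand-in for the [20,80) x [0.5,1) cell (values taken from the post above)
cell <- data.frame(
  Models   = c("m2", "m4", "m4"),
  output_X = c(35, 125, 10)
)

# Pooled mean: every row counts once
pooled <- mean(cell$output_X)

# Mean of per-model means: every model counts once
per_model     <- tapply(cell$output_X, cell$Models, mean)
mean_of_means <- mean(per_model)

# Weighting each per-model mean by its number of rows recovers the pooled mean
n_rows     <- tapply(cell$output_X, cell$Models, length)
reweighted <- weighted.mean(per_model, w = n_rows)

round(c(pooled = pooled, mean_of_means = mean_of_means, reweighted = reweighted), 2)
```

The two approaches agree only when every model contributes the same number of rows to a cell. Here m4 has two rows and m2 has one, so the unweighted mean of means gives m2 more influence than the pooled mean does.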

Thanks a lot, that's brilliant.
I thought the two were equivalent because the mean is a linear function, but I was wrong. I think the right result is 51.25.
Please let me try your mean_across_models and I'll get back to you.

Hello,
I think what you did was perfect. I tried to make a 3D plot with (X = input_M, Y = input_T, Z = output_X),
but it seems that ggplot does not allow 3D plots. I felt frustrated...

data |>
  mutate(
    input_M_percent = input_M |> str_remove("%") |> as.numeric(),
    input_M_2 = input_M_percent / 100
  ) |>
  ggplot(aes(
    x = input_M_2,
    y = output_X,
    color = Models,
    linetype = Models
  )) +
  geom_jitter(alpha = 0.10) +
  geom_smooth(method = lm, formula = y ~ splines::bs(x, 3), se = FALSE)

You might want to look at {gg3D}.

Thank you! Unfortunately, my R version does not support it. It's strange that it is so complicated to make a three-dimensional plot!

My apologies. I failed to notice that it is no longer maintained!

In the interactive plotting area there is Duncan Murdoch's {rgl} package. I've never used it, but it is up to date and maintained.

Thank you. If you find a way to use ggplot for 3D, please let me know.

The difficulty is that a 3D plot cannot simply be drawn in 2D, so ggplot has no direct aesthetic mapping for a third axis. You usually need interactivity, which ggplot alone does not provide.

In the past I've used plotly, which is quite easy to use for simple graphs but gets much more complicated for complex ones.

rgl also works fine, though I've rarely used it.
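
As a rough starting point, a 3D scatter plot in plotly could look like the sketch below. It assumes the {plotly} package is installed, and the `demo` data frame is invented for illustration (the column names only mirror the ones used in this thread), so treat it as a sketch rather than a finished solution.

```r
# Assumes plotly is installed: install.packages("plotly")
library(plotly)

# Hypothetical stand-in data: two inputs and one output
set.seed(1)
demo <- data.frame(
  input_M = runif(50, 0, 1),
  input_T = runif(50, 0, 1)
)
demo$output_X <- demo$input_M * 100 + demo$input_T * 20 + rnorm(50)

# Interactive 3D scatter: one marker per row, positioned by the three columns
p <- plot_ly(
  demo,
  x = ~input_M, y = ~input_T, z = ~output_X,
  type = "scatter3d", mode = "markers"
)
p
```

The resulting widget can be rotated and zoomed with the mouse. If a static ggplot is enough, the usual workaround is to keep two variables on the axes and map the third to colour, for example with geom_point(aes(colour = output_X)).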

Thanks a lot for your suggestions, please let me try them.
However, I think I confused 2D and 3D.

What I want is to use ggplot to plot two variables (M, T) against a third variable (X), which is the output.
I know how to do it for M against X, but not for (M, T) against X.

Okay, this is different. I suspect you simply need to transform the data from wide to long format. It probably would be easiest if you supplied us with some sample data.

A handy way to supply data is to use the dput() function. Do dput(mydata), where "mydata" is the name of your dataset. For really large datasets, dput(head(mydata, 100)) is probably enough.
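
For illustration, here is a small sketch of what that looks like. The `mydata` data frame below is invented for the example; the point is only that the text printed by dput() can be pasted into a post and evaluated by someone else to rebuild the object.

```r
# Hypothetical small data frame standing in for "mydata"
mydata <- data.frame(
  input_M  = c("10%", "50%", "90%"),
  input_T  = c(0.05, 0.3, 0.7),
  output_X = c(12, 35, 125)
)

# Print a text representation that can be pasted into a forum post
dput(head(mydata, 100))

# Round trip: evaluating the printed text rebuilds the data frame
txt     <- capture.output(dput(mydata))
rebuilt <- eval(parse(text = txt))
all.equal(rebuilt, mydata)
```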

"Data reshaping: from wide to long and back" seems to give a good example of this using {tidyverse}. Actually, I think it uses the {tidyr} package, which is a component of {tidyverse}.

Thanks a lot. I think it's doable. Please note that the sample columns are already in this thread (input_M, input_T, output_X), but I can bring real data for the plot I have in mind. I'll follow your advice and come back tomorrow with a real data sample.


Hello,

I'm back with a sample of a few points to try to make a nice 3D plot, if you can help me please.
Y axis = vertical position, and X/Z axes = horizontal position.

Thanks a lot !

dput(sample)
structure(list(xvar = c(85, 87.1052631578947, 89.2105263157895,
91.3157894736842, 93.4210526315789, 95.5263157894737, 97.6315789473684,
99.7368421052632, 101.842105263158, 103.947368421053, 106.052631578947,
108.157894736842, 110.263157894737, 112.368421052632, 114.473684210526,
116.578947368421, 118.684210526316, 120.789473684211, 122.894736842105,
125), yvar = c(0.369418963297467, 0.323099032122166, 0.279840686657854,
0.240021947632935, 0.2038893768478, 0.17154058185961, 0.142922392239015,
0.117845762330073, 0.0960118628491438, 0.0770288524421301, 0.0604760242732572,
0.0457736690859025, 0.0319835374704113, 0.013989515262266, 0.000111871201558973,
1.17176600352632e-06, 4.67598386797433e-05, 3.70050214571936e-05,
9.50270879381795e-05, -6.93185804440226e-05), zvar = c(0.25,
0.289473684210526, 0.328947368421053, 0.368421052631579, 0.407894736842105,
0.447368421052632, 0.486842105263158, 0.526315789473684, 0.565789473684211,
0.605263157894737, 0.644736842105263, 0.684210526315789, 0.723684210526316,
0.763157894736842, 0.802631578947368, 0.842105263157895, 0.881578947368421,
0.921052631578947, 0.960526315789474, 1)), class = "data.frame", row.names = c(NA,
-20L))

Assuming I have understood correctly, here are two ways to do this.

The first uses {data.table} and the second uses {tidyr}.

suppressMessages(library(data.table))
suppressMessages(library(tidyverse))

dat1 <- structure(list(xvar = c(85, 87.1052631578947, 89.2105263157895,
91.3157894736842, 93.4210526315789, 95.5263157894737, 97.6315789473684,
99.7368421052632, 101.842105263158, 103.947368421053, 106.052631578947,
108.157894736842, 110.263157894737, 112.368421052632, 114.473684210526,
116.578947368421, 118.684210526316, 120.789473684211, 122.894736842105,
125), yvar = c(0.369418963297467, 0.323099032122166, 0.279840686657854,
0.240021947632935, 0.2038893768478, 0.17154058185961, 0.142922392239015,
0.117845762330073, 0.0960118628491438, 0.0770288524421301, 0.0604760242732572,
0.0457736690859025, 0.0319835374704113, 0.013989515262266, 0.000111871201558973,
1.17176600352632e-06, 4.67598386797433e-05, 3.70050214571936e-05,
9.50270879381795e-05, -6.93185804440226e-05), zvar = c(0.25,
0.289473684210526, 0.328947368421053, 0.368421052631579, 0.407894736842105,
0.447368421052632, 0.486842105263158, 0.526315789473684, 0.565789473684211,
0.605263157894737, 0.644736842105263, 0.684210526315789, 0.723684210526316,
0.763157894736842, 0.802631578947368, 0.842105263157895, 0.881578947368421,
0.921052631578947, 0.960526315789474, 1)), class = "data.frame", row.names = c(NA,
-20L))

DT <- as.data.table(dat1)
# Reshape with data.table -------------------------------------------------

DT1 <- melt(DT, id.vars = "yvar")
ggplot(DT1, aes(yvar, value, colour = variable )) + geom_line()


# Reshape with tidyr ------------------------------------------------------
dat2 <- dat1 |>  pivot_longer(cols = !yvar)
ggplot(dat2, aes(yvar, value, colour = name )) + geom_line()

Hello, thank you for your response.
I'm sorry if I was not clear: the question was not about a 2D plot with three variables, but about a 3D plot with one Y point for each pair of X, Z points.
