Please somebody help me with a data frame

I have this data frame with information about flights.
In some cases, different types of planes fly to the same destination city.
How can I find out how many types of aircraft go to the same destination, and how many times?

|Destination City|Destination Country|Aircraft|Number of Flights|
|---|---|---|---|
|CARACAS|VENEZUELA|319|4|
|CARACAS|VENEZUELA|320|1|
|CURACAO|ANTILLAS HOLANDESAS|319|1|
|FORT LAUDERDALE|ESTADOS UNIDOS|B76|13|
|FORT LAUDERDALE|ESTADOS UNIDOS|A329|14|
|FORT LAUDERDALE|ESTADOS UNIDOS|A319|8|
|WASHINGTON|ESTADOS UNIDOS|319|1|
|WASHINGTON|ESTADOS UNIDOS|B760|2|
|WASHINGTON|ESTADOS UNIDOS|3767|22|
|WASHINGTON|ESTADOS UNIDOS|319|20|
|WASHINGTON|ESTADOS UNIDOS|320|22|
|WASHINGTON|ESTADOS UNIDOS|319|22|
|WASHINGTON|ESTADOS UNIDOS|B767|22|
|WASHINGTON|ESTADOS UNIDOS|319|21|
|WASHINGTON|ESTADOS UNIDOS|319|22|

Thanks!!!

Hi @santiagomorales, what have you tried so far?

Hi @dromano, I have tried subset and group_by to extract the conditions that I need.
I'm very new to R; I'm still learning it and taking my first steps.
Thanks

For folks to be able to help you more easily, it would be good for you to post your data and code in an easy-to-copy format. Let's start with the code you've tried -- could you post it like this?

```
<---- paste your code here and include the ``` before and after
```

For now, less is better, so it would be good to keep the code to at most 20 lines.

This is it:

arrange(DataBaase_NAL2018_MASEMISS)
str(DataBaase_NAL2018_MASEMISS %>% group_by(`Ciudad Destino`))
DestAndEquipoAndVuelos <- DataBaase_NAL2018_MASEMISS %>%
+ group_by(Ciudad Destino, Tipo de Equipo, Numero de Vuelos) %>%
DataBaase_NAL2018_MASEMISS %>%
+ group_by(1, 2, 3) %>%

These are my attempts. I understand if it is difficult to help me; in any case, thanks.
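
For reference, here is a minimal sketch of what seems to trip up the attempts above. It assumes dplyr is installed. Column names that contain spaces have to be wrapped in backticks, the leading `+` signs are console continuation prompts rather than code, and a pipeline cannot end with `%>%`. The rows are copied from the dput() output posted further down the thread; the names `vuelos`, `resumen` and `total` are made up for illustration.

```r
library(dplyr)

# Five rows copied from the dput() output posted later in the thread
vuelos <- tibble(
  `Ciudad Destino`   = c("BUCARAMANGA", "CALI", "CALI", "BUCARAMANGA", "CALI"),
  `Tipo de Equipo`   = c("C560", "H25B", "H25B", "C560", "H25B"),
  `Número de Vuelos` = c(1, 1, 1, 2, 2)
)

# Backticks around names with spaces; the pipeline ends with a verb, not %>%.
# Grouping only by city and aircraft (not by the flight count) lets sum()
# add up all flights for each city/aircraft pair.
resumen <- vuelos %>%
  group_by(`Ciudad Destino`, `Tipo de Equipo`) %>%
  summarise(total = sum(`Número de Vuelos`), .groups = "drop")

print(resumen)
```

Grouping by `Número de Vuelos` as well, as in the attempt, would make every distinct flight count its own group, which is probably not what is wanted here.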

Great! This is a good start, now let's work on the data: Could you post it here like this?

```
<--- paste output of dput(head(DataBaase_NAL2018_MASEMISS, 20)) here
```

When you run the dput() command, the output will appear in the console, which is usually in the lower left pane of RStudio.
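
As a small sketch of why dput() output is convenient to share: it prints R code that rebuilds the object, so anyone can paste it into an assignment and get the same data. The two-row data frame `flights_small` below is invented purely for illustration.

```r
# A made-up two-row data frame (hypothetical, for illustration only)
flights_small <- data.frame(
  destination = c("CALI", "CARTAGENA"),
  flights = c(2, 1),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object
dput(flights_small)

# Capturing that printed code and evaluating it rebuilds the data frame
rebuilt <- eval(parse(text = capture.output(dput(flights_small))))
print(all.equal(flights_small, rebuilt))
```

In practice you would not use eval() yourself; a helper would simply copy the printed structure(...) call and assign it to a name.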

This is the result of dput(head(DataBaase_NAL2018_MASEMISS, 20)).
Is it OK?
Thanks

> dput(head(DataBaase_NAL2018_MASEMISS, 20))
structure(list(`Ciudad Destino` = c("BUENAVENTURA", "CARTAGENA", 
"BUCARAMANGA", "CALI", "EL YOPAL", "CALI", "CAREPA", "CALI", 
"CARTAGENA", "MEDELLIN", "RIONEGRO - ANTIOQUIA", "MANIZALES", 
"PEREIRA", "BUCARAMANGA", "BUENAVENTURA", "SANTA MARTA", "SAN ANDRES - ISLA", 
"BUCARAMANGA", "CALI", "CARTAGENA"), `Tipo de Equipo` = c("C560", 
"C560", "C560", "H25B", "C560", "H25B", "C560", "H25B", "C560", 
"H25B", "C560", "C560", "C560", "C560", "C560", "C560", "H25B", 
"C560", "H25B", "C560"), `Número de Vuelos` = c(1, 2, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1), Distancia = c(331.25, 
652.59, 288.07, 280.16, 206.12, 280.16, 446.71, 280.16, 652.59, 
232.47, 215.38, 151.53, 178.02, 288.07, 331.25, 709.73, 1205.47, 
288.07, 280.16, 652.59)), row.names = c(NA, -20L), class = c("tbl_df", 
"tbl", "data.frame"))

Is this what you need?

DataBaase_NAL2018_MASEMISS <- tribble(
  ~ DestCity, ~ CountryDest, ~ Aircraft, ~ FlightsNumber,
'CARACAS', 'VENEZUELA', '319', 4, 
'CARACAS', 'VENEZUELA', '320', 1, 
'CURACAO', 'ANTILLAS HOLANDESAS', '319', 1, 
'FORT LAUDERDALE', 'ESTADOS UNIDOS', 'B76', 13, 
'FORT LAUDERDALE', 'ESTADOS UNIDOS', 'A329', 14, 
'FORT LAUDERDALE', 'ESTADOS UNIDOS', 'A319', 8, 
'WASHINGTON', 'ESTADOS UNIDOS', '319', 1, 
'WASHINGTON', 'ESTADOS UNIDOS', 'B760', 2, 
'WASHINGTON', 'ESTADOS UNIDOS', '3767', 22, 
'WASHINGTON', 'ESTADOS UNIDOS', '319', 20, 
'WASHINGTON', 'ESTADOS UNIDOS', '320', 22, 
'WASHINGTON', 'ESTADOS UNIDOS', '319', 22, 
'WASHINGTON', 'ESTADOS UNIDOS', 'B767', 22, 
'WASHINGTON', 'ESTADOS UNIDOS', '319', 21, 
'WASHINGTON', 'ESTADOS UNIDOS', '319', 22 )

DataBaase_NAL2018_MASEMISS %>% 
  group_by(DestCity, CountryDest, Aircraft) %>% 
  summarise(n = sum(FlightsNumber))

Perfect. Now folks can more easily help with your questions, so let's get to those: In your original post, it sounded like you had more than one question you wanted to answer -- could you make them explicit? And does @ap53's solution answer one of them?

In my previous reply I forgot one line. I should have put the following before the first line:

library(tidyverse)

I made a data frame from a database of 2018 domestic flights from a specific airport.
After selecting the data I needed to work with, I kept the variables (Ciudad Destino, Tipo de Equipo, Número de Vuelos, Distancia); the last one, Distancia, is not that important for what I want to do.
The data come from different airlines that operate at the same airport, so the values of "Ciudad Destino" and "Tipo de Equipo" are repeated many times. What I hope to get is the following:

How many flights ("Número de Vuelos") each type of aircraft ("Tipo de Equipo") made to the same city ("Ciudad Destino").
For example, the A320 ("Tipo de Equipo") went to CALI ("Ciudad Destino") 10 times ("Número de Vuelos"), but the B767 also went 7 times and the A319 20 times.

@ap53's solution repeats the destination several times, but I want each destination to appear only once, together with the number of times each type of plane travels there.

I hope I was clear

Thanks a lot

So A320 would be one column, B767 another column, etc.? You would also have columns for planes that never flew to CALI.

Yes, one column per type of plane would be very good, showing, if possible, the different destinations those planes flew to. Columns for planes that did not fly to CALI would not be necessary.

Like this?

DataBaase_NAL2018_MASEMISS %>% 
  group_by(DestCity, CountryDest, Aircraft) %>% 
  summarise(n = sum(FlightsNumber)) %>% 
  pivot_wider(names_from = Aircraft, values_from = n, 
              values_fill = list(n = 0))

The values_fill parameter replaces the NAs for planes that never flew to a city with 0.
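
To make the effect of values_fill concrete, here is a small sketch that pivots the same counts with and without it. It assumes dplyr and tidyr are installed and uses the first three rows of the example data; the object names `counts`, `wide_na` and `wide_zero` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# First three rows of the example data, already in long format
counts <- tribble(
  ~DestCity, ~Aircraft, ~n,
  "CARACAS", "319", 4,
  "CARACAS", "320", 1,
  "CURACAO", "319", 1
)

# Without values_fill: no 320 flew to CURACAO, so that cell is NA
wide_na <- counts %>%
  pivot_wider(names_from = Aircraft, values_from = n)

# With values_fill: the same cell is filled with 0 instead
wide_zero <- counts %>%
  pivot_wider(names_from = Aircraft, values_from = n,
              values_fill = list(n = 0))

print(wide_na)
print(wide_zero)
```

Whether 0 or NA is the better choice depends on whether "no flights" and "no data" should be distinguishable in the final table.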


It might be good to have a sense of the structure you're looking for in your ideal table. You can use the tribble() function, like @ap53 did, to make a 'toy' version of what you'd like your ideal table to look like:

tribble(
  ~ destination, ~ no_of_flights, ~ aircraft,
  'cali', 2, 'B767',
  'cali', 3, 'A310'
)

It worked!! Thank you for your patience.

Hi @santiagomorales, I think you checked my post as the solution by mistake -- could you check the post by @ap53 that worked for you, instead?

Hi, @ap53's solution is the correct one.

Thanks a lot! :slight_smile:

In that case, could you go to that post and mark it 'Solution'? Right now my post is marked.

Sure!! Done, thanks!
