I have imported a CSV dataset into R as a data frame.
As you can see, the df has 43018 observations, but when I open (View) the data and scroll down, it shows 104139 observations.
The data frame has 4 variables (let's call them x1, x2, x3, x4), and I'm looking for a specific number in column x1, so I use:
which(df$x1 == 457)
but the output is:
integer(0)
even though I can find this number when I open the data frame and search for it.
It is really hard to see what is happening in your case, as we have no data. As you can see in my mini example, I am able to get the right row to show after filtering, and I can do it with other assignments as well.
Your issue is probably row.names. You could drop the row names, or turn them into a real column that records the original row number each row came from (not its current row number).
Recreation of your problem:
#example data
(hiris <- head(iris))
#mess with the row.names
row.names(hiris) <- 1001:1006
#check it
hiris
View(hiris)
# simple fix
row.names(hiris) <- NULL
#check again
View(hiris)
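If you would rather keep the old numbers than throw them away, here is a rough sketch of the second option, storing them in a real column before resetting. The column name orig_row is just made up for this example, and it assumes the row names are numeric labels like the ones in the recreation.

```r
# Example data with non-sequential row names, as in the recreation
hiris <- head(iris)
row.names(hiris) <- 1001:1006

# Keep the old labels in a real column (orig_row is a made-up name)
hiris$orig_row <- as.integer(row.names(hiris))

# Reset the row names to the default 1..n
row.names(hiris) <- NULL

# The old label can now be searched like any other value;
# which() returns the current position of the row
pos <- which(hiris$orig_row == 1003)
pos
```

After this, the numbers on the left of View(hiris) are the current positions again, and the original labels are still available in orig_row.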
Why are those two values shown in the image not equal? In the global environment it says 43018 observations, and when I open it, it shows 104139 observations.
Thanks again.
You can clearly see on the screenshot you posted that the row numbers on the left are not continuous. You have 43018 actual rows in your data but the rownames are associated with an original ID from a previous action or operation.
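A small made-up example of how this can happen (toy data, not your dataset, and filtering is only one possible cause): after dropping rows, the remaining rows keep their old labels, so the labels in the viewer can run much higher than the actual number of rows.

```r
# Toy data frame with default row names 1..5
df <- data.frame(x1 = c(10, 457, 20, 457, 30),
                 x2 = letters[1:5])

# Drop some rows; the rows that remain keep their original labels
sub <- df[df$x1 != 457, ]

n_rows <- nrow(sub)        # actual number of rows left
labels <- row.names(sub)   # old labels, now with gaps

n_rows
labels

# The label "5" is still visible in the viewer, but there is no 5th row
sub[5, ]
```

Here nrow(sub) is 3 while the last row is still labelled "5", which is the same kind of mismatch as 43018 rows with labels going up to 104139.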
It depends on what the goal is. I try to apply row names only at the very end of all my operations, and only for specific reasons (like wanting to create a table). Typically, it is better to create your ID or name column as an actual column and not as row names.
If you have already deleted the row names, they won't be there to delete again. In other words, whether you should remove the row.names depends on whether the data.frame in question had row.names.
row.names have gone out of fashion; I rarely encounter them in modern workflows.
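One way to check this before resetting anything is sketched below. The helper has_default_row_names is hypothetical (it is not a base R function); it just compares the row names with the plain sequence 1..n.

```r
# Hypothetical helper: TRUE when the row names are just "1", "2", ..., "n"
has_default_row_names <- function(d) {
  identical(row.names(d), as.character(seq_len(nrow(d))))
}

hiris <- head(iris)
before <- has_default_row_names(hiris)     # default labels

row.names(hiris) <- 1001:1006
modified <- has_default_row_names(hiris)   # custom labels

row.names(hiris) <- NULL
after <- has_default_row_names(hiris)      # reset to default

c(before, modified, after)
```

If the check returns TRUE, there is nothing to remove and row.names(df) <- NULL changes nothing visible.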