Hi,
I'm trying to run R on a virtual machine through work, but it is running very slowly.
I was running code on a physical machine, but I needed more RAM than was available. I had a new virtual machine created for me with more RAM, which solved that problem. The only issue is that everything seems to take at least twice as long to run on this VM, even though it has newer versions of R and RStudio and what I think should be better hardware.
As an example, I need to load 13 million rows into R from SQL....
Here's trying to load just 100,000:
library(RODBC)
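loadSQL <- function() {
  # Connect with Windows authentication (server and database names are placeholders)
  dbhandle <- odbcDriverConnect('driver={SQL Server};server=server;database=database;trusted_connection=true')
  on.exit(odbcClose(dbhandle))  # release the connection when the function exits
  sql <- "SELECT TOP 100000 *
          FROM database.schema.table"
  # Run the query and return the result as a data frame
  sqlQuery(dbhandle, sql)
}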
system.time({
loadSQL()
})
On the physical machine (Intel Core i5-8250U @ 1.60 GHz, 4 cores, 8 logical processors; 8 GB RAM; Windows 10 Pro; RStudio 1.2.1335; R 3.6.1 64-bit):
user system elapsed
3.62 0.13 8.41
On the VM (Intel Xeon E5-2670 @ 2.60 GHz, 8 virtual processors; 64 GB RAM; Windows 10 Pro; RStudio 2021.09.1; R 4.1.2 64-bit):
user system elapsed
15.63 0.17 15.95
Even loading libraries seems to be excruciating on the VM!
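One way to narrow this down is to time some pure R computation that touches neither the database nor the network, and compare the two machines. In the timings above, user time on the VM (15.63 s) accounts for almost all of the elapsed time (15.95 s), whereas on the physical machine it is under half (3.62 s of 8.41 s), which hints that the VM is spending its time inside the R process rather than waiting on the server. A minimal sketch, assuming a sort-and-sum workload is a fair stand-in; the name cpuBenchmark and the size n are hypothetical, chosen only for illustration:

```r
# Hypothetical CPU-only benchmark: no database, network, or disk access.
cpuBenchmark <- function(n = 2e6) {
  set.seed(42)      # fixed seed so both machines do identical work
  x <- runif(n)     # generate n uniform random numbers
  s <- sort(x)      # sort them in memory
  sum(sqrt(s))      # simple vectorised arithmetic over the sorted vector
}

# Time the benchmark the same way as the SQL load
timing <- system.time({
  result <- cpuBenchmark()
})
print(timing)
```

If this also runs several times slower on the VM, the database is probably not the culprit. Note that this code runs on a single core, so it measures per-core speed rather than the number of processors.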
I'd appreciate any pointers!
Tim
——
Edit:
Our admin kindly tried allocating 16 rather than 8 CPUs.
No difference to the query time.
He then tried giving me dedicated rather than shared CPUs. Also no difference.
He confirmed he is using Hyper-V, not VMware.
I also tried downgrading to R 3.6 and RStudio 1.2, to remove a few more variables from the comparison. Also no difference to the execution time.
Help!