Hello!
I am in the process of porting some of my R code to Python. At the start of my workflow I connect to a Microsoft SQL Server database, do some data manipulation with dplyr, and then pull the result into memory for computation that cannot be done server-side.
Example:
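library(dplyr)
tbl(con, "my_table") %>%   # con is an existing DBI connection; tbl() is a lazy reference
  filter(x > 1) %>%        # translated to SQL and run on the server
  mutate(y = x + z) %>%    # also translated to SQL
  collect()                # only now is the result pulled into memory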
Is there anything in Python that does something similar? Most of the resources I have found read the data into memory immediately and only then run the computation. I am currently using pyodbc to connect to the database. I have looked a bit into SQLAlchemy and ibis, but have not gotten either to work so far. I also looked into siuba, but it does not currently support SQL Server.
If anyone knows of a package and method for achieving dplyr-like syntax along with server-side computation, I would greatly appreciate any information, and ideally some code examples to study.
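To make the goal concrete, the behaviour I am after might look roughly like the sketch below. It assumes SQLAlchemy Core (1.4 or later) and uses an in-memory SQLite database as a stand-in for SQL Server; the table contents are made up for illustration, and this has not been checked against SQL Server itself. The statement is only an expression until it is executed, so the filter and the new column would be computed by the database rather than in Python. It is not dplyr-like syntax, though, which is the part I am still missing.

```python
import sqlalchemy as sa

# Assumption: in-memory SQLite stands in for SQL Server here. With pyodbc the
# URL would be an "mssql+pyodbc://..." connection string instead.
engine = sa.create_engine("sqlite://")

# Hypothetical table mirroring the R example (columns x and z).
metadata = sa.MetaData()
my_table = sa.Table(
    "my_table",
    metadata,
    sa.Column("x", sa.Integer),
    sa.Column("z", sa.Integer),
)
metadata.create_all(engine)

# Made-up rows so the sketch can run end to end.
with engine.begin() as conn:
    conn.execute(
        sa.insert(my_table),
        [{"x": 1, "z": 4}, {"x": 2, "z": 10}, {"x": 3, "z": 5}],
    )

# Build the query lazily: nothing is sent to the database at this point.
# Roughly: tbl(con, "my_table") %>% filter(x > 1) %>% mutate(y = x + z)
stmt = (
    sa.select(my_table, (my_table.c.x + my_table.c.z).label("y"))
    .where(my_table.c.x > 1)
    .order_by(my_table.c.x)  # added only to make the row order deterministic
)

# Inspect the SQL that would be run on the server.
print(stmt)

# The equivalent of collect(): execute and pull the rows into memory.
with engine.connect() as conn:
    rows = [tuple(row) for row in conn.execute(stmt)]

print(rows)
```

Printing stmt shows the generated SELECT with its WHERE clause, which is a quick way to confirm that the filtering happens in SQL rather than after the fetch.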
Thanks!