Hello!
I have built a model in Keras and deployed it through Google Cloud ML, using the cloudml package.
The model is trained on numerical inputs which have been converted from strings.
When it comes to making predictions, I convert the strings to numeric values using a lookup table and then pass the numerical input to the model.
This is simple to do on the local machine:
library(tidyverse)
library(cloudml)
# lookup table
lookup <- tibble(int = c(1, 2, 3),
                 str = c('A1', 'B1', 'C1'))
# input strings
a <- 'A1'
b <- 'B1'
# convert to numeric
a_ <- lookup %>% filter(str == a) %>% select(int) %>% pull()
b_ <- lookup %>% filter(str == b) %>% select(int) %>% pull()
# send to deployed model and receive predictions
cloudml_predict(
  instances = list(c(a_, b_)),
  name = "size_predictor",
  version = "a_1",
  verbose = TRUE
)
However, when it comes to full deployment, I can't work out where I need to put the lookup table. Ideally my website will send an input file containing strings to Cloud ML. These strings will be converted to integers through the lookup table and fed to the model, and the model will then return its predictions.
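To make the flow concrete, this is a minimal sketch of what I'd like the server-side step to do; the function name, the input file format, and the "value" column are placeholders I've made up for illustration:

# hypothetical wrapper: read strings from the website, map them through the
# lookup table, and forward the resulting integers to the deployed model
predict_from_strings <- function(input_file) {
  # strings sent by the website, assumed to arrive as a CSV with a 'value' column
  inputs <- readr::read_csv(input_file)
  # convert strings to integers via the lookup table
  numeric_inputs <- inputs %>%
    left_join(lookup, by = c("value" = "str")) %>%
    pull(int)
  # send the numeric inputs to the deployed model
  cloudml_predict(
    instances = list(numeric_inputs),
    name = "size_predictor",
    version = "a_1"
  )
}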
My question is: what is the best method of creating this lookup step? Do I need to add another layer to the beginning of the Keras model to do the conversion? Should I store the lookup table in BigQuery and divert inputs through it beforehand?
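For the first option, I imagine something along these lines, although I'm not sure whether layer_string_lookup() (from newer versions of the keras R package) is available for, or compatible with, my current cloudml deployment:

library(keras)

# rough sketch of the "extra layer" idea: bake the string -> integer mapping
# into the model itself, so the deployed model accepts raw strings directly
string_input <- layer_input(shape = 1, dtype = "string")
encoded <- string_input %>%
  layer_string_lookup(vocabulary = c('A1', 'B1', 'C1'))
# ... the rest of the existing model would then be built on top of `encoded`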
The answers I have found so far apply only to Python, for example this one on Stack Overflow: Add Tensorflow pre-processing to existing Keras model (for use in Tensorflow Serving)
I am trying to have a full end-to-end (database -> deployment) project in R. I feel I am so close, but there is just this one gap left!