Hey everyone, I'm building a neural network for sentiment analysis in R with Keras. I read a JSON file into a data frame with three columns: the rating (1-5), the review text (the words reviewing a product), and a short summary of the review text. I want to predict the rating from the review text, and I'm not sure which loss function and accuracy metric to use. So far I've tokenized the text and built a vocabulary with layer_text_vectorization, then piped it into a keras_model to compile, fit, and plot. If anyone could help me debug my code, that would be awesome.
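For context, this is roughly how the data gets loaded (a sketch, assuming jsonlite and a newline-delimited JSON file; the file name and column names here are placeholders for my actual data):

```r
library(jsonlite)
library(keras)

# stream_in() reads newline-delimited JSON records into a data frame.
# "reviews.json" is a placeholder file name.
train <- stream_in(file("reviews.json"))

# Assumed column names: overall (1-5 rating), reviewText, summary.
str(train)
```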
num_words <- 10000
max_length <- 50
# Tokenize text into integer sequences, padded/truncated to max_length.
text_vectorization <- layer_text_vectorization(
  max_tokens = num_words,
  output_sequence_length = max_length
)
# Learn the vocabulary from the training text.
text_vectorization %>%
  adapt(train$reviewText)

# Inspect the learned vocabulary and vectorize one example review.
get_vocabulary(text_vectorization)
text_vectorization(matrix(train$reviewText[1], ncol = 1))
input <- layer_input(shape = c(1), dtype = "string")
output <- input %>%
  text_vectorization() %>%
  layer_embedding(input_dim = num_words + 1, output_dim = 16) %>%
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 16, activation = "relu") %>%
  layer_dropout(0.5) %>%
  layer_dense(units = 1, activation = "sigmoid")
model <- keras_model(input, output)
model %>% compile(
  optimizer = "adam",
  loss = "binary_crossentropy",  # expects 0/1 labels, not the raw 1-5 rating
  metrics = list("accuracy")
)
summary(model)
history <- model %>% fit(
  x = train$reviewText,  # raw strings; vectorization happens inside the model
  y = train$label,       # 0/1 labels (column name assumed; derive from the rating)
  epochs = 10,
  batch_size = 5,
  validation_split = 0.2
)
plot(history)
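Since the target is a 1-5 rating rather than a binary label, one option I've been considering (a sketch, assuming the rating column is named train$overall; swap in your actual column name) is to treat it as 5-class classification with a softmax output and sparse_categorical_crossentropy:

```r
# Same front end as above, but a 5-unit softmax head for the 1-5 rating.
output <- input %>%
  text_vectorization() %>%
  layer_embedding(input_dim = num_words + 1, output_dim = 16) %>%
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 16, activation = "relu") %>%
  layer_dropout(0.5) %>%
  layer_dense(units = 5, activation = "softmax")  # one unit per rating class

model <- keras_model(input, output)

model %>% compile(
  optimizer = "adam",
  loss = "sparse_categorical_crossentropy",  # integer class labels 0-4
  metrics = list("accuracy")
)

history <- model %>% fit(
  x = train$reviewText,
  y = train$overall - 1,  # shift 1-5 ratings to 0-4 for sparse labels
  epochs = 10,
  validation_split = 0.2
)
```

Would that be a more sensible loss for this data, or should I binarize the ratings (e.g. positive vs. negative) and stay with sigmoid + binary_crossentropy?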